Abstract

This paper discusses stochastic models for predicting the long-time behavior of the
trajectories of orbits of the 3x + 1 problem and, for comparison, the 5x + 1 problem. The
stochastic models are rigorously analyzable, and yield heuristic predictions (conjectures) for
the behavior of 3x + 1 orbits and 5x + 1 orbits.

1. Introduction

The 3x + 1 problem concerns the following operation on integers: if an integer is odd "multiply
by three and add one," while if it is even "divide by two." This operation is given by the Collatz
function

$$ C(n) = \begin{cases} 3n + 1 & \text{if } n \equiv 1 \pmod 2, \\ \dfrac{n}{2} & \text{if } n \equiv 0 \pmod 2. \end{cases} \tag{1.1} $$

The 3x + 1 problem concerns what happens if one iterates this operation starting from a given
positive integer n. The unsolved 3x + 1 Problem or Collatz problem is to prove (or disprove)
that such iterations always eventually reach the number 1 (and thereafter cycle, taking values
1, 4, 2, 1). This problem goes under many other names, including: Syracuse Problem, Hasse's
Algorithm, Kakutani's Problem and Ulam's Problem.



*AVK received support from an NSF Postdoc, grant DMS 0802998. 

^JCL received support from NSF Grants DMS-0500555 and DMS-0801029. 
The 3x + 1 Conjecture has now been verified for all $n < 5.67 \times 10^{18}$ by computer experiments



1.1. 3x + 1 Function 

There are a number of different functions that encode the 3x + 1 problem, which proceed through
the iteration at different speeds. The following two functions prove to be more convenient for
probabilistic analysis than the Collatz function. The first of these is the 3x + 1 function $T(n)$
(or 3x + 1 map)

$$ T(n) = \begin{cases} \dfrac{3n+1}{2} & \text{if } n \equiv 1 \pmod 2, \\ \dfrac{n}{2} & \text{if } n \equiv 0 \pmod 2. \end{cases} \tag{1.2} $$

This function divides out one power of 2 after an odd input is encountered; it is defined on the
domain of all integers.

The second function, the accelerated 3x + 1 function $U(n)$, is defined on the domain of all
odd integers, and removes all powers of 2 at each step. It is given by

$$ U(n) = \frac{3n+1}{2^{\mathrm{ord}_2(3n+1)}}, \tag{1.3} $$

in which $\mathrm{ord}_2(n)$ counts the number of powers of 2 dividing n. The function $U(n)$ was studied
by Crandall [14] in 1978.

The long-term dynamics under iteration of the 3x + 1 map has proved resistant to rigorous
analysis. It is conjectured that there is a finite positive constant C so that all trajectories
eventually enter and stay in the region $-C < n < C$. In particular, there are finitely many
periodic orbits and all trajectories eventually enter one of these periodic orbits. On the domain
of positive integers it is conjectured there is a single periodic orbit {1, 2}; this is part of the
3x + 1 Conjecture. On the domain of negative integers, the known periodic orbits are the three
orbits {-1}, {-5, -7, -10} and {-17, -25, -37, -55, -82, -41, -61, -91, -136, -68, -34}.
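The three maps defined above are easy to experiment with. The following is a minimal Python sketch, not code from the paper: the names `collatz_C`, `T`, `U`, `ord2` and `cycle_from` are chosen here purely for illustration. It implements the definitions (1.1) and (1.2) together with the accelerated map, and recovers the periodic orbits listed in the text by iterating until a value repeats.

```python
def ord2(n: int) -> int:
    """Exponent of the largest power of 2 dividing the nonzero integer n."""
    if n == 0:
        raise ValueError("ord2 is undefined for 0")
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


def collatz_C(n: int) -> int:
    """Collatz function C(n): 3n + 1 for odd n, n / 2 for even n."""
    return 3 * n + 1 if n % 2 else n // 2


def T(n: int) -> int:
    """3x + 1 map T(n): (3n + 1) / 2 for odd n, n / 2 for even n."""
    return (3 * n + 1) // 2 if n % 2 else n // 2


def U(n: int) -> int:
    """Accelerated map on odd integers: strip every factor of 2 from 3n + 1."""
    if n % 2 == 0:
        raise ValueError("U is defined on odd integers only")
    m = 3 * n + 1
    return m // 2 ** ord2(m)


def cycle_from(n: int, step, max_steps: int = 10_000) -> list:
    """Iterate `step` from n until a value repeats; return the cycle reached."""
    first_seen = {}
    trajectory = []
    while n not in first_seen:
        if len(trajectory) >= max_steps:
            raise RuntimeError("no repeat found within max_steps")
        first_seen[n] = len(trajectory)
        trajectory.append(n)
        n = step(n)
    return trajectory[first_seen[n]:]


if __name__ == "__main__":
    print(cycle_from(1, collatz_C))   # cycle of C through 1
    print(cycle_from(1, T))           # cycle of T through 1
    for start in (-1, -5, -17):
        print(start, cycle_from(start, T))
```

Python's `%` and `//` behave consistently for negative integers here because every division performed is exact, so the same functions can be used on the negative integers.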



1.2. 5x + 1 Problem 

For comparison purposes, we also consider the 5x + 1 problem, which concerns iterates of the
Collatz 5x + 1 function

$$ C_5(n) = \begin{cases} 5n + 1 & \text{if } n \equiv 1 \pmod 2, \\ \dfrac{n}{2} & \text{if } n \equiv 0 \pmod 2. \end{cases} \tag{1.4} $$
For this function we also have analogues of the other two functions above. We define the 5x + 1
function $T_5(n)$ (or 5x + 1 map), given by

$$ T_5(n) = \begin{cases} \dfrac{5n+1}{2} & \text{if } n \equiv 1 \pmod 2, \\ \dfrac{n}{2} & \text{if } n \equiv 0 \pmod 2. \end{cases} \tag{1.5} $$

It is defined on the set of all integers.

The second function, the accelerated 5x + 1 function $U_5(n)$, is defined on the domain of all
odd integers, and removes all powers of 2 at each step. It is given by

$$ U_5(n) = \frac{5n+1}{2^{\mathrm{ord}_2(5n+1)}}, \tag{1.6} $$

in which $\mathrm{ord}_2(n)$ counts the number of powers of 2 dividing n.

The long-term dynamics under iteration of the 5x + 1 map on the integers is conjecturally
quite different from the 3x + 1 map. It is conjectured that a density one set of integers belong
to divergent trajectories, ones with $|T_5^{(k)}(n)| \to \infty$. It is also conjectured that there are a finite
number of periodic orbits, which include the orbits {1, 3, 8, 4, 2} and {13, 33, 83, 208, 104, 52, 26}
on the positive integers and the orbit {-1, -2} on the negative integers. An infinite number
of trajectories eventually enter one of these orbits, but the set of all integers entering each of
these orbits is believed to have density zero.
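The 5x + 1 map can be explored in the same way. The sketch below is illustrative only (the names `T5`, `cycle_from` and `first_iterates` are not from the paper). It checks the three periodic orbits named above and prints the start of a trajectory that appears to grow, which is the behavior conjectured for almost all starting values.

```python
def T5(n: int) -> int:
    """5x + 1 map: (5n + 1) / 2 for odd n, n / 2 for even n."""
    return (5 * n + 1) // 2 if n % 2 else n // 2


def cycle_from(n: int, step, max_steps: int = 10_000) -> list:
    """Iterate `step` from n until a value repeats; return the cycle reached."""
    first_seen = {}
    trajectory = []
    while n not in first_seen:
        if len(trajectory) >= max_steps:
            raise RuntimeError("no repeat found within max_steps")
        first_seen[n] = len(trajectory)
        trajectory.append(n)
        n = step(n)
    return trajectory[first_seen[n]:]


def first_iterates(n: int, count: int) -> list:
    """Return [n, T5(n), ..., T5^(count-1)(n)]."""
    out = []
    for _ in range(count):
        out.append(n)
        n = T5(n)
    return out


if __name__ == "__main__":
    for start in (1, 13, -1):
        print(start, cycle_from(start, T5))
    # A trajectory that is not known to be periodic: only a finite prefix
    # can be inspected, so this proves nothing about divergence.
    print(first_iterates(7, 12))
```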



1.3. Stochastic models 

This paper is concerned with probabilistic models for the behavior of the 3x + 1 function
iterates, and for comparison, the 5x + 1 function iterates. The absence of rigorous analysis of
the long-term behavior under iteration of these functions provides one motivation to formulate
probabilistic models of the behavior of the 3x + 1 map and 5x + 1 map. These models can make
predictions that can be compared to empirical data, which, by uncovering discrepancies, may
lead to the discovery of new hidden regularities in their behavior under iterations. Note that
both the 3x + 1 map and the 5x + 1 map have the positive integers and negative integers as
invariant subsets; thus their dynamics can be studied separately on these domains. The original
problems concern their dynamics restricted to the positive integers.

Here we survey what is known about iteration of these maps, in frameworks which have a
probabilistic interpretation. A great deal is known about the initial behavior of the iteration of
the 3x + 1 map and 5x + 1 map; such results are summarized in §2 and §7 respectively. Here
some models for the 5x + 1 problem are new, developed in parallel with models in Lagarias and
Weiss [23]. The major unsolved questions have to do with the long-term aspects of the
iterations. It is here that stochastic models have an important role to play. We present
models for forward iteration of the map which are of random walk or Markov process type, and
models for backwards iteration of the map, which are branching processes or branching random
walks. Such models can address how the iteration behaves for a randomly selected input value
n. More sophisticated models address behavior of "extremal" input values. Analysis of these
latter models typically uses some variant of the theory of large deviations.



We are interested in using these stochastic models to explore similarities and differences
between the iteration behavior of the 3x + 1 and 5x + 1 functions. There are many similarities
which are exact parallels, listed in the concluding §11. The main differences are: in short
term iteration on the integers Z, 3x + 1 iterates tend to get smaller, while 5x + 1 iterates
tend to get larger (in absolute value). For long term iteration it is conjectured that all 3x + 1
trajectories eventually enter finite cycles; it is conjectured that almost all 5x + 1 trajectories
diverge. Stochastic models permit making some quantitative versions of this behavior. These
include the following (conjectural) predictions.

1. The number of integers $1 \le n \le x$ whose 3x + 1 forward orbit reaches 1 is about $x^{\eta_3 + o(1)}$,
where $\eta_3 = 1$.

2. Restricting to those integers $1 \le n \le x$ whose 3x + 1 map forward orbit includes 1, the
trajectories of most such n reach 1 after about 6.95212 log n steps.

3. Only finitely many 3x + 1 map trajectories starting at x reach 1 after more than
$(\gamma_3 + \epsilon) \log x$ steps, while infinitely many positive x reach 1 after more than
$(\gamma_3 - \epsilon) \log x$ steps, where $\gamma_3 \approx 41.67765$.

4. The number of integers $1 \le n \le x$ whose 5x + 1 map forward orbit includes 1 is about
$x^{\eta_5 + o(1)}$, where $\eta_5 \approx 0.65049$.

5. Restricting to those integers $1 \le n \le x$ whose 5x + 1 map forward orbit includes 1, the
trajectories of most such n reach 1 after about 9.19963 log n steps.

6. Only finitely many 5x + 1 map trajectories starting at x reach 1 after more than
$(\gamma_5 + \epsilon) \log x$ steps, while infinitely many positive x reach 1 after more than
$(\gamma_5 - \epsilon) \log x$ steps, where $\gamma_5 \approx 84.76012$.



In the case of the 3x + 1 map, extensive numerical evidence supports these predictions.
There has been much less computational testing of the 5x + 1 map, and the predictions above
are less tested in these cases.

We also survey a number of rigorous results that fit in this framework: these results describe
aspects of the initial part of the iteration. These include symbolic dynamics for accelerated
iteration, given in §5, which were used by Kontorovich and Sinai [18] to show that suitably scaled
versions of initial trajectories converge in a limit to geometric Brownian motion. These also
include results on Benford's law for the initial base B digits of the initial iterates of the functions
above, given in §9.

1.4. Contents of the paper

In §2 through §6 we first consider the 3x + 1 function. Then in §7 and §8 we give comparison
results for the 5x + 1 problem. In §9 and §10 we give results on Benford's law and for 2-adic
generalizations, in parallel for both the 3x + 1 function and 5x + 1 function.

In §2 we discuss the iteration of the 3x + 1 map. We describe its symbolic dynamics, and
formulate several statistics of orbits, which will be studied via stochastic models in later
sections. We state various rigorously proved results about these statistics. For a given starting
value n, these statistics include the λ-stopping time $\sigma_\lambda(n)$, the total stopping time $\sigma_\infty(n)$, the
maximum excursion value $t(n)$, and counting functions $N_k(a)$ and $N_k^*(a)$, for the number of
backward iterates at depth k of a given integer a, with the latter only counting iterates that are
not divisible by 3. We also review what has been rigorously proved about these statistics, and
give tables of empirical results known about these statistics, found by large scale computations.
Further data appears in the paper of Oliveira e Silva [31] (in this volume).

In §3 we discuss stochastic models for a single orbit under forward iteration of the 3x + 1
map. These include a multiplicative random product model (MRP model) and a logarithmic
rescaling giving an additive random walk model taking unequal steps (BRW model), which has
a negative drift. These models predict that all orbits converge to a bounded set, and that the
total stopping time $\sigma_\infty(n)$ for the 3x + 1 map of a random starting point n should be about
6.95212 log n steps, and as $n \to \infty$ have a Gaussian distribution around this value, with
standard deviation proportional to $\sqrt{\log n}$.

In §4 we discuss models for extreme values of the total stopping time of the 3x + 1 map.
We introduce a repeated random walk model (RRW model) which produces a random
trajectory separately for each integer n. We present results obtained using the theory of large
deviations which rigorously determine behavior in this model of a statistic which is an analogue
of the scaled total stopping time $\gamma(n) := \sigma_\infty(n) / \log n$. The model predicts that the limit superior
of these values should be a constant $\gamma_{RRW} \approx 41.67765$, which is larger than the average value
6.95212 this variable takes. This prediction agrees fairly well with the empirical data given in §2.

In §5 we survey results concerning forward iteration of the accelerated 3x + 1 map. These
include a complete description of its symbolic dynamics. We also show that a suitable scaling
limit of these trajectories is a geometric Brownian motion, and discuss the equidistribution of
various images via entropy.

In §6 we describe stochastic models simulating backward iteration of the 3x + 1 function.
These models grow random labelled trees, whose levels describe branching random walks. These
models give exact answers for the expected number of leaves at a given depth k, analogous to
the number of integers having total stopping time k, and also predict the extremal behavior
of the scaled total stopping time function $\gamma(n) := \sigma_\infty(n) / \log n$. This yields a prediction for the
limit superior of these values to be $\gamma_{BP} \approx 41.677647$, the same value as for the repeated random
walk process above.
In §7 and §8 we present analogous results for the 5x + 1 map. Much less empirical study
has been made for iteration of the 5x + 1 function, so there is less empirical data available for
comparison.

In §7 we define 5x + 1 statistics of orbits. These are analogues of the 3x + 1 statistics given in
§2, but some require modification to reflect the fact that 5x + 1 orbits grow on average. We also
review what is known rigorously about the behavior of this function; in particular the symbolic
dynamics of the forward iteration of the 5x + 1 map is exactly the same as that for the 3x + 1
map. The statistics introduced include a reverse analogue of the stopping time, the $\lambda^{+}$-stopping
time $\sigma_\lambda^{+}(n)$, and also the total stopping time $\sigma_\infty(n; T_5)$. Since most trajectories are believed to
be unbounded, the total stopping time is believed to take the value $+\infty$ for almost all initial
conditions. In place of the maximum excursion value, we consider the minimum excursion value
$t^{-}(n)$.

In §8 we present results on stochastic models for the 5x + 1 iteration. These include
repeated random walk models for the forward iteration of this function, paralleling results of
§4, the convergence to Brownian motion of appropriately scaled trajectories, paralleling results
of §5, and branching random walk models for inverse iteration, paralleling results of §6. In
the latter case we present some new results. The most interesting result of the analysis of
these models is the prediction that the number of integers below x which iterate under the
5x + 1 map to 1 should be about $x^{\eta_5}$ with $\eta_5 \approx 0.65041$, and that all integers below x that
eventually iterate to 1 necessarily do it in at most $(\gamma_{5,BP} + o(1)) \log x$ steps, where $\gamma_{5,BP} \approx 84.76012$.

In §9 we discuss another property of 3x + 1 iterates and 5x + 1 iterates: Benford's law. In this
context "Benford's law" asserts that the distribution of the initial decimal digits of numbers in a
trajectory $\{T^{(k)}(n) : 1 \le k \le m\}$ approaches a particular non-uniform probability distribution,
the Benford distribution, in which an initial digit less than k occurs with probability $\log_{10} k$, so
that 1 is the most likely initial digit. We summarize results showing that most initial starting
values of both the 3x + 1 map and the 5x + 1 map have initial iterates exhibiting Benford-like
behavior; this property holds for any fixed finite set of initial iterates.



In §10 we review results on the extensions to the domain of 2-adic integers $\mathbb{Z}_2$ of the
functions $T_3(n)$ and $T_5(n)$. These functions have the pleasant property that their definition makes
sense 2-adically, and each function has a unique continuous 2-adic extension, which we denote
$T_3 : \mathbb{Z}_2 \to \mathbb{Z}_2$ and $T_5 : \mathbb{Z}_2 \to \mathbb{Z}_2$, respectively. These extended maps are measure-preserving for
the 2-adic Haar measure, and are ergodic in a very strong sense. The interesting feature is that
at the level of 2-adic extensions the 3x + 1 map and 5x + 1 map are identical maps from the
perspective of measure theory. They are both topologically and measurably conjugate to the
full shift on the 2-adic integers, hence they are topologically and measurably conjugate to each
other! Thus their dynamics are "the same." This contrasts with the great difference between
these maps viewed on the domain of integers.



In §11 we present concluding remarks, summarizing this paper, comparing properties under
iteration of the 3x + 1 map and 5x + 1 map. The short-run behavior under iteration of these
maps has some strong similarities. However all evidence indicates that the long-run behavior
of iteration for the 3x + 1 map and the 5x + 1 map on the integers Z is very different. We also
list a set of insights and topics for further investigation.

Notation. For convenience, when comparing the 3x + 1 maps with the corresponding 5x + 1
maps, we may write $C_3(n)$, $T_3(n)$, $U_3(n)$ in place of $C(n)$, $T(n)$, $U(n)$ above.

Acknowledgments. The authors thank Steven J. Miller for a careful reading of and many 
corrections to an earlier draft of this manuscript. AVK wishes to thank the hospitality of Dorian 
Goldfeld and Columbia University during this project. 

2. The 3x + 1 Function: Symbolic Dynamics and Orbit Statistics 

In this section we consider the 3x + 1 map $T(n)$. We recall basic properties of its symbolic
dynamics. We also define several different statistics for describing its behavior on individual 
trajectories, and summarize what is rigorously proved about these statistics. In later sections 
we will present probabilistic models which are intended to model the behavior of these statistics. 

2.1. 3x + 1 Symbolic Dynamics: Parity Sequence 

The behavior of the map $T(n)$ under iteration is completely described by the parities of the
successive iterates.

Definition 2.1 (i) For a function $T : \mathbb{Z} \to \mathbb{Z}$ and input value $n \in \mathbb{Z}$ define the parity sequence
of n to be

$$ S(n) := \left(n \ (\mathrm{mod}\ 2),\ T(n) \ (\mathrm{mod}\ 2),\ T^{(2)}(n) \ (\mathrm{mod}\ 2),\ \ldots\right) \tag{2.1} $$

in which $T^{(k)}(n)$ denotes the k-th iterate, so that $T^{(2)}(n) := T(T(n))$. This is an infinite vector
of zeros and ones.

(ii) For $k \ge 1$ its k-truncated parity sequence is a vector giving the initial segment of k terms
of $S(n)$, i.e.

$$ S^{[k]}(n) := \left(n \ (\mathrm{mod}\ 2),\ T(n) \ (\mathrm{mod}\ 2),\ T^{(2)}(n) \ (\mathrm{mod}\ 2),\ \ldots,\ T^{(k-1)}(n) \ (\mathrm{mod}\ 2)\right). \tag{2.2} $$

A basic result on the iteration is as follows.

Theorem 2.1 (Parity Sequence Symbolic Dynamics) The k-truncated parity sequence $S^{[k]}(n)$
of the first k iterates of the 3x + 1 map $T(x)$ is periodic in n with period $2^k$. Each of the $2^k$
possible 0-1 vectors occurs exactly once in the initial segment $1 \le n \le 2^k$.

Proof. This result is due to Terras [38] in 1976 and Everett [16] in 1977. A proof is given as
Theorem B in Lagarias [21]. ■
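Theorem 2.1 can be checked by brute force for small k. The following Python sketch is an illustration only (the helper names `parity_vector` and `check_parity_theorem` are invented here): it computes the k-truncated parity sequence for $1 \le n \le 2^k$, confirms that all $2^k$ vectors are distinct, and confirms that shifting n by $2^k$ leaves the vector unchanged on that range.

```python
def T(n: int) -> int:
    """3x + 1 map: (3n + 1) / 2 for odd n, n / 2 for even n."""
    return (3 * n + 1) // 2 if n % 2 else n // 2


def parity_vector(n: int, k: int) -> tuple:
    """k-truncated parity sequence (n mod 2, T(n) mod 2, ..., T^(k-1)(n) mod 2)."""
    bits = []
    for _ in range(k):
        bits.append(n % 2)
        n = T(n)
    return tuple(bits)


def check_parity_theorem(k: int) -> bool:
    """Finite check of Theorem 2.1 for one value of k.

    Verifies (a) the 2^k vectors for 1 <= n <= 2^k are pairwise distinct, so each
    0-1 vector of length k occurs exactly once, and (b) n and n + 2^k have the
    same vector for every n in that range.
    """
    period = 2 ** k
    vectors = [parity_vector(n, k) for n in range(1, period + 1)]
    all_distinct = len(set(vectors)) == period
    periodic = all(
        parity_vector(n + period, k) == vectors[n - 1]
        for n in range(1, period + 1)
    )
    return all_distinct and periodic


if __name__ == "__main__":
    for k in range(1, 11):
        print(k, check_parity_theorem(k))
```

This is a finite sanity check rather than a proof; the general statement is the theorem of Terras and Everett cited above.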

An immediate consequence is that an integer n is uniquely determined by the parity sequence
$S(n)$ of its forward orbit. To see this, note that any two distinct integers fall in different residue
classes (mod $2^k$) for large enough k, so will have different parity sequences. The parity sequence
thus provides a symbolic dynamics which keeps track of the orbit. Taken on the integers, only
countably many different parity sequences occur out of the uncountably many possible infinite
0-1 sequences.

2.2. 3x + 1 Stopping Time Statistics: λ-stopping times

The initial statistic we consider is the number of iteration steps needed to observe a fixed
amount of decrease of size in the iterate.

Definition 2.2 For fixed $\lambda > 0$ the λ-stopping time $\sigma_\lambda(n)$ of a map $T : \mathbb{Z} \to \mathbb{Z}$ from input n
is the minimal value of $k \ge 0$ such that $T^{(k)}(n) < \lambda n$, i.e.

$$ \sigma_\lambda(n) := \inf \left\{ k \ge 0 : \frac{T^{(k)}(n)}{n} < \lambda \right\}. \tag{2.3} $$

If no such value k exists, we set $\sigma_\lambda(n) = +\infty$.

This notion for $\lambda = 1$ was introduced in 1976 by Terras [38], who called it the stopping time,
and denoted it $\sigma(n)$. The more general λ-stopping time is interesting in the range $0 < \lambda \le 1$;
it satisfies $\sigma_\lambda(n) = 0$ for all $\lambda > 1$.

Terras [38] studied the set of numbers having stopping time at most k, denoted

$$ S_1(k) := \{ n : \sigma_1(n) \le k \}. \tag{2.4} $$

He used Theorem 2.1 to show ([38], [39]) that this set of integers has a natural density, as
defined below, and that this density approaches 1 as $k \to \infty$.

Later this result was generalized. Rawsthorne [32] in 1985 introduced the case of general
λ, and Borovkov and Pfeifer [10, Theorem 2] in 2000 considered criteria with several stopping
time conditions.

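There are several notions of density of a set S of the natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$. The
lower asymptotic density $\underline{\mathbb{D}}(S)$ is defined for all infinite sets S, and is given by

$$ \underline{\mathbb{D}}(S) := \liminf_{t \to \infty} \frac{1}{t} \left| \{ n \in S : n \le t \} \right|. \tag{2.5} $$

The assertion that an infinite set $S \subset \mathbb{N}$ of natural numbers has a natural density $\mathbb{D}(S)$ is the
assertion that the following limit exists:

$$ \mathbb{D}(S) := \lim_{t \to \infty} \frac{1}{t} \left| \{ n \in S : n \le t \} \right|. \tag{2.6} $$

Sets with a natural density automatically have $\underline{\mathbb{D}}(S) = \mathbb{D}(S)$.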

Theorem 2.2 (λ-Stopping Time Natural Density)

(i) For the 3x + 1 map $T(n)$, and any fixed $0 < \lambda \le 1$ and $k \ge 1$, the set $S_\lambda(k)$ of integers
having λ-stopping time at most k has a well-defined natural density $\mathbb{D}(S_\lambda(k))$.

(ii) For λ fixed and $k \to \infty$, this natural density satisfies

$$ \mathbb{D}(S_\lambda(k)) \to 1. \tag{2.7} $$

In particular, the set of numbers with finite λ-stopping time has natural density 1.
Proof. For the special case $\lambda = 1$, that is the stopping time, this is the basic result of Terras
[38], [39], obtained also by Everett [16]. A proof for $\lambda = 1$ is given as Theorem A in Lagarias
[21]. The idea is that by Theorem 2.1, each arithmetic progression (mod $2^k$) has iterates that
multiply by a certain pattern of $\frac{1}{2}$ or $\frac{3}{2}$ for the first k steps. A certain subset of the $2^k$ arithmetic
progressions (mod $2^k$) will have the product of these numbers fall below λ, and these arithmetic
progressions give the density. To see that the density goes to 1 as $k \to \infty$, one must show that
most arithmetic progressions (mod $2^k$) have a product smaller than one. Theorem 2.1 says that
all products occur equally likely, and since the geometric mean of these products is $(3/4)^{k/2} < 1$,
one can establish that such a decrease occurs for all but an exponentially small set of patterns,
of size $O(2^{0.94995k})$ out of $2^k$ possible patterns. One can show a similar result for decrease by a
factor of any fixed λ, and a proof of natural density for general $\lambda > 0$ is given in Borovkov and
Pfeifer [10, Theorem 3]. ■
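The counting argument in this proof can be made concrete for $\lambda = 1$. The sketch below is a hedged illustration, not the authors' computation: it counts the 0-1 patterns of length k for which some initial segment has multiplier $3^{a}/2^{j} < 1$ (a ones among the first j entries), ignoring the additive "+1" terms, and divides by $2^k$. Identifying this fraction with the natural density of $S_1(k)$ is exactly the step supplied by the cited proofs; the function name `pattern_density` is an assumption of this example.

```python
from fractions import Fraction


def pattern_density(k: int) -> Fraction:
    """Fraction of 0-1 patterns of length k with some prefix multiplier below 1.

    A prefix of length j containing a ones has multiplier (3/2)^a (1/2)^(j-a)
    = 3^a / 2^j.  `survivors[a]` counts the prefixes of the current length with
    a ones such that every nonempty prefix so far has multiplier above 1
    (equality is impossible, since 3^a = 2^j only when a = j = 0).
    """
    survivors = {0: 1}
    for length in range(1, k + 1):
        extended = {}
        for ones, count in survivors.items():
            for bit in (0, 1):
                a = ones + bit
                if 3 ** a > 2 ** length:        # multiplier still above 1
                    extended[a] = extended.get(a, 0) + count
        survivors = extended
    never_below = sum(survivors.values())
    return 1 - Fraction(never_below, 2 ** k)


if __name__ == "__main__":
    for k in (1, 2, 3, 4, 8, 16, 32, 64):
        print(k, float(pattern_density(k)))
```

The dynamic program keeps only the number of ones seen so far, so it runs in time quadratic in k instead of enumerating all $2^k$ patterns. The printed values increase toward 1, in line with (2.7).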

The results above are rigorous results, and therefore we have no compelling need to find
stochastic models to model the behavior of stopping times. Nevertheless stochastic models
intended to analyze other statistics produce in passing models for stopping time distributions.
In §3.1 we present such a model, which gives an interpretation of these stopping time densities
as exact probabilities of certain events.



Remark. The analysis in Theorem 2.2 treats λ as fixed. In fact one can also prove rigorous
results which allow λ to vary slowly (as a function of n), under the restriction that $\lambda < \log_2 n$.

2.3. 3x + 1 Stopping Time Statistics: Total Stopping Times 

The following concept concerns the speed at which positive integers iterate to 1 under the map 
T, assuming they eventually get there. 

Definition 2.3 The total stopping time $\sigma_\infty(n)$ for iteration of the 3x + 1 map $T(n)$ is defined
for positive integers n by

$$ \sigma_\infty(n) := \inf \{ k \ge 0 : T^{(k)}(n) = 1 \}. $$

We set $\sigma_\infty(n) = +\infty$ if no finite k has this property.

The 3x + 1 Conjecture asserts that all positive integers have a finite total stopping time.

Concerning lower bounds for this statistic, there are some rigorous results. First, since each
step decreases n by at most a factor of 2, we trivially have

$$ \sigma_\infty(n) \ge \frac{\log n}{\log 2} \approx 1.4426 \log n. $$

The strongest result on the existence of integers having a large total stopping time is the
following result of Applegate and Lagarias [5, Theorem 1.1].
Theorem 2.3 (Lower Bound for 3x + 1 Total Stopping Times) There are infinitely many n
whose total stopping time satisfies

$$ \sigma_\infty(n) \ge \left( \frac{29}{29 \log 2 - 14 \log 3} \right) \log n \approx 6.14316 \log n. \tag{2.8} $$



Nothing has been rigorously proved about either the average size of the total stopping time, 
or about upper bounds for the total stopping time (since such would imply the main conjec- 
ture!). This provides motivation to study stochastic models for this statistic, to make guesses 
how it may behave. 

The various stochastic models discussed in later sections, as well as empirical evidence given
below, suggest that the size of this statistic will always be proportional to log n. This motivates
the following definition.

Definition 2.4 For $n > 1$ the scaled total stopping time $\gamma_\infty(n)$ of the 3x + 1 function is given
by

$$ \gamma_\infty(n) := \frac{\sigma_\infty(n)}{\log n}. \tag{2.9} $$

This value will be finite for all positive n only if the 3x + 1 conjecture is true.
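The total stopping time and its scaled version are straightforward to compute for any n whose orbit is known to reach 1. The sketch below is illustrative (the function names and the `max_steps` safety cap are assumptions of this example, not part of the paper). It also counts the odd terms among $n, T(n), \ldots, T^{(\sigma_\infty(n)-1)}(n)$, which is the counting convention that reproduces the ones(n) entries reported for n = 3 and n = 27 in Table 1.

```python
import math


def T(n: int) -> int:
    """3x + 1 map: (3n + 1) / 2 for odd n, n / 2 for even n."""
    return (3 * n + 1) // 2 if n % 2 else n // 2


def total_stopping_time(n: int, max_steps: int = 10 ** 6) -> tuple:
    """Return (sigma_inf(n), number of odd terms met before reaching 1).

    Raises RuntimeError if 1 is not reached within max_steps iterations; the
    cap is only a safeguard for this sketch and says nothing about divergence.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    odd_terms = 0
    while n != 1:
        if steps >= max_steps:
            raise RuntimeError("1 not reached within max_steps")
        odd_terms += n % 2
        n = T(n)
        steps += 1
    return steps, odd_terms


def scaled_total_stopping_time(n: int) -> float:
    """gamma_inf(n) = sigma_inf(n) / log n, defined for n > 1."""
    if n < 2:
        raise ValueError("n must be at least 2")
    steps, _ = total_stopping_time(n)
    return steps / math.log(n)


if __name__ == "__main__":
    for n in (3, 7, 9, 27):
        steps, odd_terms = total_stopping_time(n)
        print(n, steps, odd_terms, round(scaled_total_stopping_time(n), 6))
```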

A stochastic model in §3 makes strong predictions about the distribution of scaled total
stopping times: they should have a Gaussian distribution with mean

$$ \mu := \left( \frac{1}{2} \log \frac{4}{3} \right)^{-1} \approx 6.95212 $$

and variance

$$ \sigma := \frac{1}{2} \log 3 \left( \frac{1}{2} \log \frac{4}{3} \right)^{-3/2}, $$

cf. Theorem 3.2. In particular, half of all integers ought to have a total stopping time
$\sigma_\infty(n) \ge \mu \log n \approx 6.95212 \log n$. It seems scandalous that there is no unconditional proof
that infinitely many n have a stopping time at least this large, compared to the bound (2.8) in
Theorem 2.3 above!
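The three constants appearing in this subsection can be recomputed directly. In the sketch below the variable names are chosen for illustration, and the interpretation of the mean as the reciprocal of a random-walk drift is stated as an assumption: if each step of $\log T^{(k)}(n)$ adds $\log(3/2)$ or $\log(1/2)$ with probability 1/2 each, the average decrease per step is $\frac{1}{2}\log\frac{4}{3}$, so about $\log n$ divided by that amount steps are needed to reach 1.

```python
import math

# Trivial lower bound: each step at most halves n, so sigma_inf(n) >= log n / log 2.
TRIVIAL_CONSTANT = 1 / math.log(2)

# Constant in the Applegate-Lagarias lower bound (2.8).
LOWER_BOUND_CONSTANT = 29 / (29 * math.log(2) - 14 * math.log(3))

# Assumed model: steps log(3/2) and log(1/2), each with probability 1/2.
mean_step = 0.5 * math.log(3 / 2) + 0.5 * math.log(1 / 2)   # equals -(1/2) log(4/3)
MU = 1 / (-mean_step)                                       # predicted mean of gamma_inf

if __name__ == "__main__":
    print(f"trivial bound constant : {TRIVIAL_CONSTANT:.5f}")
    print(f"constant in (2.8)      : {LOWER_BOUND_CONSTANT:.5f}")
    print(f"predicted mean mu      : {MU:.5f}")
```

The gap between the proved constant in (2.8) and the predicted mean is what the closing remark of the paragraph above refers to.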



We next define a limiting constant associated with extremal values of the scaled total
stopping time for the 3x + 1 map.

Definition 2.5 The 3x + 1 scaled stopping constant is the quantity

$$ \gamma = \gamma_3 := \limsup_{n \to \infty} \gamma_\infty(n) = \limsup_{n \to \infty} \frac{\sigma_\infty(n)}{\log n}. \tag{2.10} $$

We now give empirical data about these extremal values. Table 1 presents empirical data
on record holders for the function $\gamma_\infty(n)$, compiled by Roosendaal [33]. This table also includes
data on another statistic called the ones-ratio (or completeness), taken from Roosendaal [33,
k     k-th record n_k                    σ_∞(n_k)   ones(n_k)   ones-ratio   γ_∞(n_k)
1     3                                  5          2           0.400000     4.551196
2     7                                  11         5           0.454545     5.652882
3     9                                  13         6           0.461538     5.916555
4     27                                 70         41          0.585714     21.238915
5     230 631                            278        164         0.589928     22.512720
6     626 331                            319        189         0.592476     23.899366
7     837 799                            329        195         0.592705     24.122828
8     1 723 519                          349        207         0.593123     24.303826
9     3 732 423                          374        222         0.593583     24.714906
10    5 649 499                          384        228         0.593750     24.699176
11    6 649 279                          416        248         0.596154     26.479917
12    8 400 511                          429        256         0.596737     26.907006
13    63 728 127                         592        357         0.603041     32.943545
14    3 743 559 068 799                  966        583         0.603520     33.366656
15    100 759 293 214 567                1134       686         0.604938     35.169600
?16   104 899 295 810 901 231            1404       850         0.605413     35.823841
?17   268 360 655 214 719 480 367        1688       1022        0.605450     35.885221
?18   6 852 539 645 233 079 741 799      1840       1115        0.605978     36.595864
?19   7 219 136 416 377 236 271 195      1848       1120        0.606061     36.716918

Table 1: Record values for γ_∞(n) and for ones-ratio(n).



Completeness and Gamma Records]. The function ones(n) counts the number of odd iterates
of the 3x + 1 function to reach 1 starting from n (including 1), and

$$ \text{ones-ratio}(n) := \frac{\mathrm{ones}(n)}{\sigma_\infty(n)}. \tag{2.11} $$

Table 1 shows that the function $\gamma_\infty(n)$ is not a monotone increasing function of the ones-ratio;
compare rows 9 and 10. The values with question marks mean that all intermediate values have
not been searched, so these values are not known to be record holders.

In §4 we present a stochastic model which makes a prediction for the extremal value of
γ. A quite different model is discussed in §7 which makes exactly the same prediction! For
both models the analogue of the constant $\gamma := \limsup \gamma_\infty(n)$ exists and equals a constant
which numerically is approximately 41.677647, with corresponding ones-ratio of about 0.609091.
Compare these predictions with the data in Table 1.

2.4. 3x + 1 Size Statistics: Maximum Excursion Values

Another interesting statistic is the maximum value attained in a trajectory, which we call the 
maximum excursion value. 



Definition 2.6 The maximum excursion value $t(n)$ is the maximum value occurring in the
forward iteration of the integer n, i.e.

$$ t(n) := \max \{ T^{(k)}(n) : k \ge 0 \}, \tag{2.12} $$

with $t(n) = +\infty$ if the trajectory is divergent.
k     k-th record n_k    t(n_k)            r(n_k)    ρ(n_k)
1     2                  2                 0.500     1.000
2     3                  8                 0.889     1.893
3     7                  26                0.531     1.674
4     15                 80                0.356     1.618
5     27                 4 616             6.332     2.560
6     255                6 560             0.101     1.586
7     447                19 682            0.099     1.620
8     639                20 782            0.051     1.539
9     703                125 252           0.253     1.792
10    1 819              638 468           0.193     1.781
11    4 255              3 405 068         0.188     1.800
12    4 591              4 076 810         0.193     1.805
13    9 663              13 557 212        0.145     1.790
14    20 895             25 071 632        0.057     1.712
15    26 623             53 179 010        0.075     1.746
16    31 911             60 506 432        0.059     1.728
17    60 975             296 639 576       0.080     1.771
18    77 671             785 412 368       0.130     1.819
19    113 383            1 241 055 674     0.097     1.799
20    138 367            1 399 161 680     0.073     1.779
21    159 487            8 601 188 876     0.338     1.861
22    270 271            12 324 038 948    0.169     1.858
23    665 215            26 241 642 656    0.059     1.789
24    704 511            28 495 741 760    0.057     1.788
25    1 042 431          45 119 577 824    0.042     1.770

Table 2: Seeds n giving record heights for 3x + 1 maximum excursion value t(n).



The quantity $t(n)$ will be finite for all n if and only if there are no divergent trajectories for
the 3x + 1 problem (but does not exclude the possibility of as yet unknown loops).

We define the following extremal statistic for maximum excursions.

Definition 2.7 Let the 3x + 1 maximum excursion ratio be given by

$$ \rho(n) := \frac{\log t(n)}{\log n}. \tag{2.13} $$

Then the 3x + 1 maximum excursion constant is the quantity

$$ \rho := \limsup_{n \to \infty} \rho(n) = \limsup_{n \to \infty} \frac{\log t(n)}{\log n}. \tag{2.14} $$

The maximal excursion constant will be $+\infty$ if there is a divergent trajectory. The fact
that the logarithmic scaling used in defining this constant is the "correct" scaling is justified
by empirical data given in Oliveira e Silva [31] (in this volume) and by the predictions of the
stochastic model given in §4. As explained in §4.3, the stochastic model prediction for the
maximum excursion constant is $\rho = 2$.
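The maximum excursion value and ratio can be computed along the same lines as the stopping times. The following Python sketch is illustrative only (the names `max_excursion`, `excursion_ratio` and the `max_steps` cap are assumptions of this example). It tests $\rho(n) > 2$ through the equivalent integer inequality $t(n) > n^2$, which avoids floating-point comparisons, and scans the odd seeds below 2000.

```python
import math


def T(n: int) -> int:
    """3x + 1 map: (3n + 1) / 2 for odd n, n / 2 for even n."""
    return (3 * n + 1) // 2 if n % 2 else n // 2


def max_excursion(n: int, max_steps: int = 10 ** 6) -> int:
    """t(n): largest value on the forward T-orbit of a positive integer n.

    The orbit is followed until it reaches 1; the cycle {1, 2} is then taken
    into account by including the value 2.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    peak = n
    steps = 0
    while n != 1:
        if steps >= max_steps:
            raise RuntimeError("1 not reached within max_steps")
        n = T(n)
        peak = max(peak, n)
        steps += 1
    return max(peak, 2)   # after reaching 1 the orbit alternates 1, 2, 1, 2, ...


def excursion_ratio(n: int) -> float:
    """rho(n) = log t(n) / log n, defined for n > 1."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return math.log(max_excursion(n)) / math.log(n)


if __name__ == "__main__":
    print(max_excursion(27), round(excursion_ratio(27), 3))
    # Odd seeds below 2000 with rho(n) > 2, i.e. t(n) > n^2.
    print([n for n in range(3, 2000, 2) if max_excursion(n) > n * n])
```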
Figure 2.1: A plot of n versus the maximal excursion ratio $\rho(n)$ for $3 \le n \le 1\,042\,431$ and odd,
cf. (2.13). The only seeds n in this range with $\rho(n) > 2$ are n = 27, 31, 41, 47, 55, and 63
(which all look at this scale as if they are on the y-axis).



n                            t(n)                                                   r(n)     ρ(n)
27                           4 616                                                  6.332    2.560
319 804 831                  707 118 223 359 971 240                                6.914    2.099
1 410 123 943                3 562 942 561 397 226 080                              1.792    2.028
3 716 509 988 199            103 968 231 672 274 974 522 437 732                    7.527    2.070
9 016 346 070 511            126 114 763 591 721 667 597 212 096                    1.551    2.015
1 254 251 874 774 375        1 823 036 311 464 280 263 720 932 141 024              1.159    2.004
1 980 976 057 694 848 447    32 012 333 661 096 566 765 082 938 647 132 369 010     8.158    2.050

Table 3: Values of n for which the maximal excursion ratio $\rho(n) = \log t(n) / \log n > 2$ (equivalently,
$r(n) = t(n)/n^2 > 1$), culled from Oliveira e Silva's [31, Table 8] record t(n) values.
In Table 2 we give the set of initial champion values for the maximum excursion, extracted
from data of Oliveira e Silva [31]. For comparison we give for each the ratio $r(n) := t(n)/n^2$ and
the value of the maximal excursion ratio $\rho(n) = \log t(n) / \log n$. It is also useful to examine the larger
table to $10^{18}$ given in Oliveira e Silva [31].

While record values of $t(n)$ have received tremendous computational attention, there has
not been a substantial amount of effort put into collecting those n with large $\rho(n)$ (the
difference being that the former seeks seeds n with large values of $t(n)$, whereas the latter seeks
large values of $t(n)$ relative to the size of n). We have computed that the only seeds $n < 10^6$
for which $\rho(n) > 2$ are: $n \in \{27, 31, 41, 47, 55, 63\}$, cf. Figure 2.1.



Nevertheless, some "large" values of $\rho(n)$ already appear in tables of large t(n)'s. In Table 3
we extract from a table of t(n) champions computed by Oliveira e Silva [31] the subset of
seeds n for which $\rho(n) > 2$, i.e.

$$r(n) = \frac{t(n)}{n^2} > 1.$$

Only seven such values appear. This data seems to (however weakly) support Conjecture 4.2.
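The two ratios are easy to recompute for small seeds. The following is a minimal sketch, not the authors' code: it assumes that t(n) is the largest value on the T-orbit of n (the seed included), as suggested by Table 3, and that every orbit in the scanned range reaches 1. The function names are ours.

```python
import math


def T(n: int) -> int:
    """One step of the 3x + 1 map T: (3n + 1)/2 for odd n, n/2 for even n."""
    return (3 * n + 1) // 2 if n % 2 else n // 2


def max_excursion(n: int) -> int:
    """t(n): largest value on the T-orbit of n (seed included), followed down to 1."""
    peak = n
    while n != 1:
        n = T(n)
        peak = max(peak, n)
    return peak


def r(n: int) -> float:
    """The ratio t(n) / n^2."""
    return max_excursion(n) / n ** 2


def rho(n: int) -> float:
    """Maximal excursion ratio log t(n) / log n, for n >= 2."""
    return math.log(max_excursion(n)) / math.log(n)


# rho(n) > 2 is equivalent to t(n) > n^2, which avoids floating-point comparisons.
large_rho_seeds = [n for n in range(3, 2000, 2) if max_excursion(n) > n * n]
print(large_rho_seeds)
print(max_excursion(27), round(r(27), 3), round(rho(27), 3))
```

Only odd seeds are scanned, matching Figure 2.1; an even seed such as 54 inherits the excursion of 27 and would also pass the test.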



2.5. 3x + 1 Count Statistics: Inverse Iterate Counts 

In considering backwards iteration of the 3x + 1 function, we can ask: given an integer a, how
many numbers n have $T^{(k)}(n) = a$, that is, iterate forward to a after exactly k iterations?

The set of backwards iterates of a given number a can be pictured as a tree; we call these
3x + 1 trees and describe their structure in §6. Here $N_k(a)$ counts the number of leaves at depth
k of a tree with root node a, and $N_k^*(a)$ counts the number of leaves in a pruned 3x + 1 tree, in
which all nodes with label $n \equiv 0 \pmod 3$ have been removed. The definitions are as follows.

Definition 2.8 (1) Let $N_k(a)$ count the number of integers that forward iterate under the
3x + 1 map T(n) to a after exactly k iterations, i.e.

$$N_k(a) := |\{n : T^{(k)}(n) = a\}|. \tag{2.15}$$

(2) Let $N_k^*(a)$ count the number of integers not divisible by 3 that forward iterate under
the 3x + 1 map T(n) to a after exactly k iterations, i.e.

$$N_k^*(a) := |\{n : T^{(k)}(n) = a,\ n \not\equiv 0 \pmod 3\}|. \tag{2.16}$$

The case a = 1 is of particular interest, since the quantities then count integers that iterate
to 1. We set

$$N_k := N_k(1), \qquad N_k^* := N_k^*(1).$$

The secondary quantity $N_k^*(a)$ is introduced because it is somewhat more convenient for analysis.
It satisfies the monotonicity properties $N_k^*(a) \le N_{k+1}^*(a)$ and

$$N_k^*(a) \le N_k(a) \le \sum_{j=0}^{k} N_j^*(a) \le (k+1)\, N_k^*(a).$$
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We have the trivial exponential upper bound 

$$N_k(a) \le 2^k, \tag{2.17}$$

since each number has at most 2 preimages. We are interested in the exponential growth rate
of $N_k(a)$.

Definition 2.9 (1) For a given a the 3x + 1 tree growth constant $\delta_3(a)$ is given by

$$\delta_3(a) := \limsup_{k \to \infty} \frac{1}{k} \log N_k(a). \tag{2.18}$$

(2) The 3x + 1 universal tree growth constant is $\delta = \delta_3 = \delta_3(1)$.

The constant $\delta_3(a)$ exists and is finite, as follows from the upper bound (2.17). It is easy
to prove unconditionally that $\delta_3(3a) = 0$, because the only preimages of a number 3a are $2^k \cdot 3a$,
and $N_k(3a) = 1$ for all $k \ge 1$. The interesting case is when $a \not\equiv 0 \pmod 3$.

Applegate and Lagarias [2] determined by computer the maximal and minimal number of
leaves in pruned 3x + 1 trees of depth k for $k \le 30$. The maximal and minimal number of leaves
in such trees at level k is given by

$$N_k^{+} := \max\{N_k^*(a) : a \ (\mathrm{mod}\ 3^{k+1}) \text{ with } a \not\equiv 0 \pmod 3\}$$

and

$$N_k^{-} := \min\{N_k^*(a) : a \ (\mathrm{mod}\ 3^{k+1}) \text{ with } a \not\equiv 0 \pmod 3\},$$

respectively. Counts for the number of leaves in maximum and minimum size trees of various
depths k are given in the following table, taken from Applegate and Lagarias ([2], [3]). It is
known that the average number of leaves at depth k (averaged over a) is proportional to $(4/3)^k$;
therefore in Table 4 below we include the value $(4/3)^k$ and the scaled statistics $(4/3)^{-k} N_k^{\pm}$.
This table also gives the number of distinct types of trees of each depth (there are some
symmetries which speed up the calculation).

Applegate and Lagarias [2, Theorem 1.1] proved the following result by an easy induction
using this table.

Theorem 2.4 (3x + 1 Tree Sizes) For any fixed $a \not\equiv 0 \pmod 3$ and for all sufficiently large k,

$$(1.302053)^k \le N_k^*(a) \le (1.358386)^k. \tag{2.19}$$

In consequence, for any $a \not\equiv 0 \pmod 3$,

$$\log(1.302053) \le \delta_3(a) \le \log(1.358386). \tag{2.20}$$

We describe probabilistic models for 3x + 1 inverse iterates in §6. The models are Galton-Watson
processes for the number of leaves in the tree, and branching random walks for the sizes
of the labels in the tree. The model prediction is that $\delta_3(a) = \log(4/3)$ for all $a \not\equiv 0 \pmod 3$.
k    # tree types   N_k^-   N_k^+   (4/3)^k    (3/4)^k N_k^-   (3/4)^k N_k^+

1    4              1       2       1.33       0.750           1.500
2    8              1       3       1.78       0.562           1.688
3    14             1       4       2.37       0.422           1.688
4    24             2       6       3.16       0.633           1.898
5    42             2       8       4.21       0.475           1.898
6    76             3       10      5.62       0.534           1.780
7    138            4       14      7.49       0.534           1.869
8    254            5       18      9.99       0.501           1.802
9    470            6       24      13.32      0.451           1.802
10   876            9       32      17.76      0.507           1.802
11   1638           11      42      23.68      0.465           1.774
12   3070           16      55      31.57      0.507           1.742
13   o7oD           on      74      42.09      -               1.758
14   10850          27      100     56.12      0.481           1.782
15   20436          36      134     74.83      0.481           1.791
16   38550          48      178     99.77      0.481           1.784
17   72806          64      237     133.03     0.481           1.782
18   137670         87      311     177.38     0.490           1.753
19   260612         114     413     236.50     0.482           1.746
20   493824         154     548     315.34     0.488           1.738
21   936690         206     736     420.45     0.490           1.751
22   1778360        274     988     560.60     0.489           1.762
23   3379372        363     1314    747.47     0.486           1.758
24   6427190        484     1744    996.62     0.486           1.750
25   12232928       649     2309    1328.83    0.488           1.738
26   23300652       868     3084    1771.77    0.490           1.741
27   44414366       1159    4130    2362.36    0.491           1.748
28   84713872       1549    5500    3149.81    0.492           1.746
29   161686324      2052    7336    4199.75    0.489           1.747
30   308780220      2747    9788    5599.67    0.491           1.748

Table 4: Normalized extreme values for 3x + 1 trees of depth k
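The first rows of Table 4 can be recomputed by brute force. The sketch below is our own illustration, not the method of Applegate and Lagarias: it grows the pruned backward tree level by level from one positive representative a of each residue class mod $3^{k+1}$, using the two possible preimages of m under T, namely 2m and, when $m \equiv 2 \pmod 3$, (2m - 1)/3.

```python
def pruned_leaf_count(a: int, k: int) -> int:
    """N_k^*(a): number of n not divisible by 3 with T^(k)(n) = a."""
    level = [a] if a % 3 else []
    for _ in range(k):
        next_level = []
        for m in level:
            # Even preimage 2m; it is never divisible by 3 because m is not.
            next_level.append(2 * m)
            # Odd preimage (2m - 1)/3 exists exactly when m = 2 (mod 3).
            if m % 3 == 2:
                p = (2 * m - 1) // 3
                if p % 3:  # prune labels divisible by 3
                    next_level.append(p)
        level = next_level
    return len(level)


def extremes(k: int) -> tuple[int, int]:
    """(N_k^-, N_k^+): min and max of N_k^*(a) over classes a mod 3^(k+1), 3 not dividing a."""
    counts = [pruned_leaf_count(a, k) for a in range(1, 3 ** (k + 1) + 1) if a % 3]
    return min(counts), max(counts)


for depth in range(1, 7):
    smallest, largest = extremes(depth)
    scale = 0.75 ** depth
    print(depth, smallest, largest, round(smallest * scale, 3), round(largest * scale, 3))
```

The cost grows like $3^{k+1}$ trees of up to $2^k$ leaves each, so this direct enumeration is only practical for small k; the symmetries mentioned above are what make k = 30 feasible.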






2.6. 3x + 1 Count Statistics: Total Inverse Iterate Counts 

In considering backwards iteration of the 3x + 1 function from an integer a, complete data is
the set of integers that contain a in their forward orbit. The 3x + 1 problem concerns exactly
this question for a = 1. The following function describes this set.

Definition 2.10 Given an integer a, the inverse iterate counting function $\pi_a(x)$ counts the
number of integers n with $|n| \le x$ that contain a in their forward orbit under the 3x + 1 function.
That is,

$$\pi_a(x) := \#\{n : |n| \le x \text{ and } T^{(k)}(n) = a \text{ for some } k \ge 0\}. \tag{2.21}$$

It is possible to obtain rigorous lower bounds for this counting function. For $a \equiv 0 \pmod 3$
the set of inverse iterates is exactly $\{2^k a : k \ge 0\}$, and $\pi_a(x) = \lfloor \log_2(x/|a|) \rfloor + 1$ (for $a \ne 0$
and $x \ge |a|$) grows logarithmically. If $a \not\equiv 0 \pmod 3$ then $\pi_a(x)$ satisfies a bound $\pi_a(x) > x^c$
for some positive c, as was first shown by Crandall [11] in 1978. The strongest method currently
known to obtain lower bounds on $\pi_a(x)$ was initiated by Krasikov [19] in 1989, and extended in
[3], [20]. It gives the following result.

Theorem 2.5 (Inverse Iterate Lower Bound) For each $a \not\equiv 0 \pmod 3$, there is a positive
constant $x_0(a)$ such that for all $x \ge x_0(a)$,

$$\pi_a(x) > x^{0.84}. \tag{2.22}$$

Proof. This is proved in Krasikov and Lagarias [20]. The proof uses systems of difference
inequalities (mod $3^k$), analyzed in Applegate and Lagarias [3], and by increasing k one gets
better exponents. The exponent above was obtained by computer calculation using k = 9. ■
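For small x the counting function can be evaluated directly by forward iteration. The following is a minimal sketch under stated assumptions: the target a is positive (so negative n never contribute), and every orbit starting at or below x reaches 1, which is known for all x in any range one would scan this way. The function name is ours.

```python
import math


def T(n: int) -> int:
    """One step of the 3x + 1 map T."""
    return (3 * n + 1) // 2 if n % 2 else n // 2


def inverse_iterate_count(a: int, x: int) -> int:
    """pi_a(x) for a > 0: number of 1 <= n <= x whose forward T-orbit contains a."""
    if a in (1, 2):
        # Every positive orbit that reaches 1 then cycles through 1 and 2.
        return x
    count = 0
    for n in range(1, x + 1):
        m = n
        while m != a and m != 1:
            m = T(m)
        count += (m == a)
    return count


for target in (3, 4, 5):
    value = inverse_iterate_count(target, 10_000)
    print(target, value, round(math.log(value) / math.log(10_000), 3))
```

For a = 3 the count is the number of $3 \cdot 2^k \le x$, while for a = 4 every seed except 1 and 2 is counted, since each orbit entering the cycle must pass through 4.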



The following statistics measure the size of the inverse iterate set in the sense of fractional
dimension.

Definition 2.11 Given an integer a, the upper and lower 3x + 1 growth exponents for a are
given by

$$\eta_3^{+}(a) := \limsup_{x \to \infty} \frac{\log \pi_a(x)}{\log x},$$

and

$$\eta_3^{-}(a) := \liminf_{x \to \infty} \frac{\log \pi_a(x)}{\log x}.$$

If these quantities are equal, we define the 3x + 1 growth exponent $\eta_3(a)$ to be
$\eta_3(a) = \eta_3^{+}(a) = \eta_3^{-}(a)$.

We clearly have $\eta_3(a) = 0$ if $a \equiv 0 \pmod 3$. For the remaining values Applegate and
Lagarias made the following conjecture.

Conjecture 2.1 (3x + 1 Growth Exponent Conjecture) For all integers $a \not\equiv 0 \pmod 3$, the
3x + 1 growth exponent $\eta_3(a)$ exists, with

$$\eta_3(a) = 1. \tag{2.23}$$
The truth of the 3x + 1 Conjecture would imply that $\eta_3(1) = 1$; however it does not seem
to determine $\eta_3(a)$ for all such a. Applegate and Lagarias [3, Conjecture A] made the stronger
conjecture that for each $a \not\equiv 0 \pmod 3$, $\pi_a(x)$ grows linearly, i.e. there is a constant $c_a > 0$
such that $\pi_a(x) > c_a x$ holds for all x > 1.

Note that Theorem 2.5 shows that $\eta_3^{-}(a) \ge 0.84$ when $a \not\equiv 0 \pmod 3$. The lower bound
in Conjecture 2.1 thus seems approachable. A stochastic model in §6.5 makes the prediction
that $\eta_3(a) = 1$.



3. 3x + 1 Forward Iteration: Random Product and Random Walk Models 



In this section we formulate stochastic models intended to predict the behavior of iterations 
of the 3x + 1 map T(n) on a "random" starting value n. These models are exactly analyzable. 
We describe results obtained for these models, which can be viewed as predictions for the "av- 
erage" behavior of the 3x + 1 function. 



3.1. Multiplicative Random Product Model and λ-stopping times

Recall that the λ-stopping time is defined (see (2.3)) by

$$\sigma_\lambda(n) := \inf\left\{k \ge 0 : \frac{T^{(k)}(n)}{n} < \lambda\right\}.$$

Rawsthorne [32] and Borovkov and Pfeifer [10] obtained a probabilistic interpretation of the
λ-stopping time, as follows. They consider a stochastic model which studies the random products

$$Y_k := X_1 X_2 \cdots X_k,$$

in which the $X_i$ are independent identically distributed (i.i.d.) random variables having
the discrete distribution

$$X_i = \begin{cases} \frac{3}{2} & \text{with probability } \frac{1}{2}, \\ \frac{1}{2} & \text{with probability } \frac{1}{2}. \end{cases}$$

We call this the 3x + 1 multiplicative random product (3x + 1 MRP) model.

This model does not include the choice of the starting value of the iteration, which would
correspond to $X_0$; the random variable $Y_k$ really models the ratio $T^{(k)}(X_0)/X_0$. They define for
$\lambda > 0$ the λ-stopping time random variable

$$V_\lambda(\omega) := \inf\{k : Y_k < \lambda\}, \tag{3.1}$$

where $\omega = (X_1, X_2, X_3, \ldots)$ denotes a sequence of random variables as above. This random
vector ω will model the effect of choosing a random starting value $n = X_0$ in iteration of the
3x + 1 map.



This stochastic model can be used to exactly describe the density of λ-stopping times, as
follows. Let $\mathbb{P}[E]$ denote the probability of an event E.

Theorem 3.1 (λ-Stopping Time Density Formula) For the 3x + 1 function T(n) the natural
density $D(S_\lambda(k))$ for integers having λ-stopping time at most k is given exactly by the formula

$$D(S_\lambda(k)) = \mathbb{P}[V_\lambda(\omega) \le k], \tag{3.2}$$

in which $V_\lambda$ is the λ-stopping time random variable in the 3x + 1 multiplicative random product
(MRP) model.

Proof. In 1985 Rawsthorne [32, Theorem 1] proved a weaker version of this result, with
$D(S_\lambda(k))$ replaced by the lower asymptotic density $\underline{D}(S_\lambda(k))$. The result, using natural density,
is a special case of Borovkov and Pfeifer [10, Theorem 3]. ■
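The density formula can be checked numerically for small k. The sketch below is illustrative only: it computes the model probability exactly by enumerating all $2^k$ equally likely factor sequences, and compares it with the frequency of integers $n \le N$ whose ratio $T^{(j)}(n)/n$ drops below λ within k steps. It assumes $\lambda \le 1$, so the step j = 0 never triggers, and uses the strict inequality of (3.1); the function names and the cutoff N are ours.

```python
from fractions import Fraction
from itertools import product


def T(n: int) -> int:
    """One step of the 3x + 1 map T."""
    return (3 * n + 1) // 2 if n % 2 else n // 2


def mrp_probability(lam: Fraction, k: int) -> Fraction:
    """P[V_lambda <= k] in the MRP model, by exact enumeration of 2^k sequences."""
    hits = 0
    for factors in product((Fraction(3, 2), Fraction(1, 2)), repeat=k):
        y = Fraction(1)
        for factor in factors:
            y *= factor
            if y < lam:
                hits += 1
                break
    return Fraction(hits, 2 ** k)


def empirical_frequency(lam: Fraction, k: int, limit: int) -> Fraction:
    """Fraction of 1 <= n <= limit with T^(j)(n)/n < lam for some 1 <= j <= k."""
    hits = 0
    for n in range(1, limit + 1):
        m = n
        for _ in range(k):
            m = T(m)
            if Fraction(m, n) < lam:
                hits += 1
                break
    return Fraction(hits, limit)


for steps in (1, 2, 5, 8):
    model = mrp_probability(Fraction(1), steps)
    observed = empirical_frequency(Fraction(1), steps, 4096)
    print(steps, float(model), float(observed))
```

With λ = 1 the products $Y_k = 3^j / 2^k$ never equal λ, so the choice between strict and non-strict inequality does not affect the comparison.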

It is natural to apply the 3x + 1 MRP model with an initial condition added, which is a proxy
for the expected behavior of the total stopping time. To do this we must allow variable λ (as a
function of n), in a range of parameters where there is no rigorous proof that the model behavior
agrees with that of iteration of the map T(n), namely for $\lambda = \alpha \log n$ with various $\alpha > 1$. What
is missing is a result saying that it accurately matches the behavior of iteration of the 3x + 1 map.

The behavior of the resulting probabilistic model is rigorously analyzable, as we discuss in
the next subsection, cf. Theorem 3.2 below.



3.2. Additive Random Walk Model and Total Stopping Times 

The 3x + 1 iteration takes $x_0 = n$ and $x_k = T^{(k)}(n)$. In studying the iteration, it is often more
convenient to use a logarithmic scale and set $y_k = \log x_k$ (natural logarithm) so that

$$y_k = \log x_k := \log T^{(k)}(n).$$

Then we have

$$y_{k+1} = \begin{cases} y_k + \log\frac{3}{2} + e_k & \text{if } x_k \equiv 1 \pmod 2, \\ y_k + \log\frac{1}{2} & \text{if } x_k \equiv 0 \pmod 2, \end{cases} \tag{3.3}$$

with

$$e_k := \log\left(1 + \frac{1}{3 x_k}\right). \tag{3.4}$$

Here $e_k$ is small as long as $|x_k|$ is large.

Theorem 2.1 implies that if an integer is drawn at random from $[1, 2^k]$ then its k-truncated
parity sequence will be uniformly distributed in $\{0,1\}^k$. In consequence, equations (3.3) and
(3.4) show that the quantities $\log T^{(k)}(n)$ (natural logarithm) can be modeled by a random
walk starting at initial position $y_0 = \log n$ and taking steps of size $\log\frac{3}{2}$ if the parity value is
odd, and $\log\frac{1}{2}$ if it is even.

The MRP model considered before is converted to an additive model by making a logarithmic
change of variable, taking new random variables $W_k := \log X_k$. The additive model considers
the random variables $Z_k$ which are a sum of random variables

$$Z_k := Z_0 + \log Y_k = Z_0 + W_1 + W_2 + \cdots + W_k.$$

Here $Z_0$ is a specified initial starting point, and $Z_k$ is the result of a (biased) random walk,
taking steps of size either $\log\frac{3}{2}$ or $\log\frac{1}{2}$ with equal probability. In terms of these variables, the
λ-stopping time random variable above is

$$V_\lambda(\omega) = \inf\{k : Z_k - Z_0 < \log \lambda\}.$$

We consider the approximation of this iteration process by the following stochastic model,
which we term the 3x + 1 Biased Random Walk Model (3x + 1 BRW Model). For an integer
n > 1 it separately makes a random walk which takes steps of size $\log\frac{1}{2}$ half the time and $\log\frac{3}{2}$
half the time. We can write such a random variable as

$$\xi_k := -\log 2 + \delta_k \log 3,$$

in which the $\delta_k$ are independent Bernoulli zero-one random variables. The random walk positions
$\{Z_k : k \ge 0\}$ are then random variables having starting value $Z_0 = \log n$, and with

$$Z_k := Z_0 + \xi_1 + \xi_2 + \cdots + \xi_k.$$

The $Z_k$ define a biased random walk, whose expected drift μ is given by

$$\mu := \mathbb{E}[\xi_k] = -\log 2 + \frac{1}{2}\log 3 = \frac{1}{2}\log\frac{3}{4} \approx -0.14384. \tag{3.5}$$

The standard deviation σ of each step is given by

$$\sigma := \sqrt{\mathrm{Var}[\xi_k]} = \frac{1}{2}\log 3 \approx 0.54930.$$

In the additive model we associate to a random walk a total stopping time random variable

$$S_\infty(n) := \min\{k \ge 0 : Z_k \le 0, \text{ given } Z_0 = \log n\},$$

which detects when the walk first crosses 0 (this corresponds in the multiplicative model to
reaching 1). The expected number of steps to reach a nonpositive value starting from $Z_0 = \log n$
is

$$\mathbb{E}[S_\infty(n)] = \frac{1}{|\mu|}\log n = \left(\frac{1}{2}\log\frac{4}{3}\right)^{-1}\log n \approx 6.95212 \log n.$$
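The biased random walk is simple to simulate. The following is a rough sketch, not taken from Borovkov and Pfeifer: it draws the steps $-\log 2 + \delta_k \log 3$ with a seeded pseudo-random generator, records the first time the walk is nonpositive, and compares the sample mean of the stopping time divided by log n with $1/|\mu|$. The starting value $10^{12}$, the sample size and the seed are arbitrary choices of ours.

```python
import math
import random

MU = 0.5 * math.log(3) - math.log(2)   # expected drift of one step, cf. (3.5)
SIGMA = 0.5 * math.log(3)              # standard deviation of one step


def total_stopping_time(n: int, rng: random.Random) -> int:
    """First k with Z_k <= 0 for the walk started at Z_0 = log n."""
    z = math.log(n)
    k = 0
    while z > 0:
        z += -math.log(2) + rng.randint(0, 1) * math.log(3)
        k += 1
    return k


rng = random.Random(0)
start = 10 ** 12
samples = [total_stopping_time(start, rng) for _ in range(2000)]
mean_ratio = sum(samples) / len(samples) / math.log(start)
print("drift:", round(MU, 5), "1/|drift|:", round(1 / abs(MU), 5))
print("sample mean of S/log n:", round(mean_ratio, 3))
```

The sample mean sits slightly above $1/|\mu|$ because the walk overshoots 0 by up to log 2 on its final step, an effect of relative size O(1/log n).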

As noted above, Borovkov and Pfeifer [10] consider the multiplicative stochastic model
obtained by exponentiation of the positions of the biased random walk above, from a given starting
value $X_0 = e^{Z_0}$. They conclude the following result [10, Theorem 5].

Theorem 3.2 (3x + 1 BRW Gaussian Limit Distribution) In the Biased Random Walk Model,
for each fixed $n \ge 2$ define the normalized random variable

$$Z_\infty(n) := \frac{S_\infty(n) - \frac{1}{|\mu|}\log n}{\sigma\, |\mu|^{-3/2} \sqrt{\log n}},$$

which has cumulative distribution function $P_n(x) := \mathrm{Prob}[Z_\infty(n) < x]$. Here $\mu = \frac{1}{2}\log\frac{3}{4}$ is the
drift (3.5), and $\sigma = \frac{1}{2}\log 3$. Then for each fixed real x, allowing n to vary, one has

$$P_n(x) := \mathrm{Prob}[Z_\infty(n) < x] \to \Phi(x), \quad \text{as } n \to \infty,$$

where $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt$ is the cumulative distribution function of the standard normal
distribution N(0, 1).

Borovkov and Pfeifer note further that the rate of convergence of the normalized distribution
$P_n(x)$ with fixed n to the limiting normal distribution as $n \to \infty$ is uniform in x, but is quite
slow. They assert that for all $n \ge 2$ and all $-\infty < x < \infty$,

$$|P_n(x) - \Phi(x)| = O\left((\log n)^{-1/2}\right), \tag{3.6}$$

where the implied constant in the O-symbol is absolute.



They also propose a better approximation to the distribution of the total stopping time of a
random integer of size near n, reflecting the fact that it is a nonnegative random variable. They
assert that the rescaled variable

$$Y_\infty(n) := \frac{S_\infty(n)}{\log n}$$

should have a good second order approximation given by the nonnegative random variable Y(n)
having the distribution function

$$\Phi_n(x) = C_n \frac{\sqrt{\log n}}{\sigma} \int_0^x \frac{1}{\sqrt{2\pi t^3}} \exp\left(-\frac{(1 - |\mu| t)^2 \log n}{2\sigma^2 t}\right) dt, \qquad x > 0,$$

in which $C_n$ is a normalizing constant ([10, eqn. (25)]).

They view the random variable $S_\infty(n)$ as providing a model for the total stopping time
$\sigma_\infty(n)$ of the 3x + 1 function, where one compares the ensemble of values $\{\sigma_\infty(n) : x \le n \le c_1 x\}$
with $c_1 > 1$ fixed with independent samples of values $S_\infty(n)$. The result above (with error term
$O\left(\frac{1}{\sqrt{\log n}}\right)$) predicts that for any $\epsilon > 0$ the number of values that do not satisfy

$$\left(\frac{1}{|\mu|} - \frac{1}{(\log x)^{\frac{1}{2}-\epsilon}}\right)\log x < \sigma_\infty(n) < \left(\frac{1}{|\mu|} + \frac{1}{(\log x)^{\frac{1}{2}-\epsilon}}\right)\log x$$

is o(x), as $x \to \infty$. They compare the distribution of Y(n) with numerical data $\frac{\sigma_\infty(n)}{\log n}$ for the
3x + 1 function for $n \approx 10^6$ and find fairly good agreement.

Figure 3.2: Histograms for $\sigma_\infty(x_0) / \ln x_0$ and its stochastic analog $T(n) / \ln n$ with fitted density.
Taken from Borovkov-Pfeifer [10].
4. 3x + 1 Forward Iteration: Large Deviations and Extremal Trajectories



Lagarias and Weiss [23] formulated and studied stochastic models which are intended to 
give predictions for the extremal behavior of iteration of the 3x + 1 map T{n). 



4.1. 3x + 1 Repeated Random Walk Model

Lagarias and Weiss studied the following Repeated Random Walk Model (3x + 1 RRW Model).
For each integer $n \ge 1$, independently run a 3x + 1 biased random walk model trial with
starting value $Z_{0,n} = \log n$. That is, generate an infinite sequence of independent random walks
$\{Z_{k,n} : k \ge 0\}$, with one walk generated for each value of n. The model data is the countable
set of random variables

$$\omega := \{Z_{k,n} : n \ge 1,\ k \ge 0\}, \tag{4.1}$$

in which the initial starting points $Z_{0,n} := \log n$ are deterministic, and all other random variables
are stochastic. From this data, one can form random variables that are functions of ω,
corresponding to the total stopping times and the maximum excursion values above.

The 3x + 1 RRW model is exactly analyzable, and makes predictions for the value of the
scaled stopping time constant, and for the maximum excursion constant. A subtlety of the
RRW model is the fact that there are exponentially many trials with inputs of a given length
j, namely for those n with $e^j \le n < e^{j+1}$, which have initial condition $j \le Z_{0,n} < j + 1$, so that
the theory of large deviations becomes relevant to the analysis.



4.2. 3x + 1 RRW Model Prediction: Extremal Total Stopping Times 

The 3x + 1 RRW model can be used to produce statistics analogous to the scaled total stopping
time $\gamma_\infty(n)$ and the 3x + 1 scaled stopping time constant γ, cf. (2.9) and (2.10).

For a given trial ω it yields an infinite sequence of total stopping time random variables

$$S_\infty(\omega) := (S_\infty(1), S_\infty(2), S_\infty(3), \ldots, S_\infty(n), \ldots),$$

where $S_\infty(n)$ is computed using the individual random walk $Z_{k,n}$. Thus we can compute the
scaled statistics $\frac{S_\infty(n)}{\log n}$ for $n \ge 2$, and set

$$\gamma(\omega) := \limsup_{n \to \infty} \frac{S_\infty(n)}{\log n}$$

as a stochastic analogue of the quantity γ.

The 3x + 1 RRW model has the following asymptotic limiting behavior for this statistic,
given by Lagarias and Weiss [23, Theorem 2.1].
Theorem 4.1 (3x + 1 RRW Scaled Stopping Time Constant) For the 3x + 1 RRW model,
with probability one the scaled stopping time

$$\gamma(\omega) := \limsup_{n \to \infty} \frac{S_\infty(n)}{\log n}$$

is finite and equals a constant

$$\gamma_{RRW} \approx 41.677647,$$

which is the unique real number $\gamma > \left(\frac{1}{2}\log\frac{4}{3}\right)^{-1} \approx 6.952$ solving the fixed point equation

$$\gamma\, g\!\left(\frac{1}{\gamma}\right) = 1. \tag{4.2}$$

Here the rate function g(a) is given by

$$g(a) := \sup_{\theta \in \mathbb{R}} \left(\theta a - \log M_{RRW}(\theta)\right), \tag{4.3}$$

in which

$$M_{RRW}(\theta) := \frac{1}{2}\left(2^{\theta} + \left(\frac{2}{3}\right)^{\theta}\right) \tag{4.4}$$

is a moment generating function associated to the random walk.
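The constant can be recovered numerically from (4.2)-(4.4). The sketch below is our own illustration rather than the computation of Lagarias and Weiss. It uses the fact that the supremum in (4.3) is attained where the θ-derivative vanishes, which for $\log\frac{2}{3} < a < \log 2$ gives $3^{\theta} = (a + \log\frac{3}{2})/(\log 2 - a)$, and then solves the fixed point equation by bisection on an interval we chose by hand.

```python
import math

LOG2 = math.log(2)
LOG3 = math.log(3)


def log_mgf(theta: float) -> float:
    """log M_RRW(theta) with M_RRW(theta) = (2^theta + (2/3)^theta) / 2."""
    return math.log(0.5 * (2.0 ** theta + (2.0 / 3.0) ** theta))


def rate(a: float) -> float:
    """g(a) = sup_theta (theta * a - log M_RRW(theta)), for log(2/3) < a < log 2."""
    # Stationary point of the concave function theta -> theta*a - log M_RRW(theta).
    theta = math.log((a + math.log(1.5)) / (LOG2 - a)) / LOG3
    return theta * a - log_mgf(theta)


def solve_gamma(lo: float = 7.0, hi: float = 100.0, tol: float = 1e-10) -> float:
    """Bisection for the root of gamma * g(1/gamma) = 1 on [lo, hi]."""
    def f(gamma: float) -> float:
        return gamma * rate(1.0 / gamma) - 1.0

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(round(solve_gamma(), 6))
```

The left endpoint 7.0 lies just above $1/|\mu| \approx 6.952$, where the rate function is nearly zero, so the bracketing function changes sign exactly once on the chosen interval.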

Lagarias and Weiss also obtain a density result on the number of n getting values close to
the extremal constant, as follows ([23, Theorem 2.2]).

Theorem 4.2 (3x + 1 RRW Scaled Stopping Time Distribution) For the 3x + 1 RRW model,
and for any constant α satisfying

$$\left(\frac{1}{2}\log\frac{4}{3}\right)^{-1} < \alpha < \gamma_{RRW}, \tag{4.5}$$

one has the bound

$$\mathbb{E}\left[\left|\left\{n \le x : \frac{S_\infty(n)}{\log n} \ge \alpha\right\}\right|\right] \le x^{1 - \alpha g(1/\alpha)}. \tag{4.6}$$

In the reverse direction, for any $\epsilon > 0$ this expected value satisfies

$$\mathbb{E}\left[\left|\left\{n \le x : \frac{S_\infty(n)}{\log n} \ge \alpha\right\}\right|\right] \ge x^{1 - \alpha g(1/\alpha) - \epsilon} \tag{4.7}$$

for all sufficiently large $x > x_0(\epsilon)$.

This theorem says that not only is there an upper bound $\gamma_{RRW}$ on the asymptotic limiting
value of the stopping ratio, but the set of n for which one gets a value above α becomes very
sparse (in the logarithmic sense) as α approaches $\gamma_{RRW}$ from below. Theorem 4.2 is analogous
to obtaining a multifractal spectrum for this problem. This result is well-suited for comparison
with experimental data on 3x + 1 iterates.

This analysis suggests the following prediction, which we state as a conjecture.
Conjecture 4.1 (3x + 1 Scaled Stopping Constant Conjecture) The 3x + 1 scaled stopping
constant γ is finite and is given by

$$\gamma = \gamma_{RRW} \approx 41.677647. \tag{4.8}$$

The large deviations model does more than predict an extremal value; it also predicts that
the numbers that approach the extremal value must have a trajectory of iterates whose graph
has a specified shape, which is a linear function when properly scaled. In Figure 4.3 we graph
the set of scaled points

$$\left\{\left(\frac{k}{\log n}, \frac{\log T^{(k)}(n)}{\log n}\right) : 0 \le k \le \sigma_\infty(n)\right\}.$$

The predicted large deviations extremal trajectory in this scaling has graph a straight line
connecting the points (0, 1) and $(\gamma_{RRW}, 0)$. Figure 4.3 shows the scaled trajectories with
starting seeds taken from Table 1, i.e. those with record values for $\gamma_\infty(n)$. Compare to
Lagarias and Weiss [23, Figure 3].

Figure 4.3: Scaled trajectories of $n_k$ maximizing $\gamma(n)$ for record values from Table 1 (thin for
$1 \le k \le 10$; regular for $11 \le k \le 15$; thick for $16 \le k \le 19$), plotted against the predicted
trajectory.



4.3. 3x + 1 RRW Model Prediction: Maximum Excursion Constant 

For the 3x + 1 RRW Stochastic Model, an appropriate statistic for a single trial that corresponds
to the maximum excursion value is

$$t(n; \omega) := \sup\left(e^{Z_{k,n}} : k \ge 0\right).$$

The 3x + 1 RRW model behavior for extremal behavior of maximum excursions $t(n; \omega)$ is
given in the following result [23, Theorem 2.3].
Theorem 4.3 (3x + 1 RRW Maximum Excursion Constant) For the 3x + 1 RRW model, with
probability one the quantities $t(n; \omega)$ are finite for every $n \ge 1$. In addition, with probability
one the random quantity

$$\rho(\omega) := \limsup_{n \to \infty} \frac{\log t(n; \omega)}{\log n} = \limsup_{n \to \infty} \left(\sup_{k \ge 0} \frac{Z_{k,n}}{\log n}\right) \tag{4.9}$$

equals the constant

$$\rho_{RRW} = 2. \tag{4.10}$$

Lagarias and Weiss also prove [23, Theorem 2.4] a result permitting a quantitative comparison
with data.

Theorem 4.4 (3x + 1 RRW Maximum Excursion Density Function) For the 3x + 1 RRW
model, for any fixed $0 < \alpha < 1$, the expected value

$$\mathbb{E}\left[\left|\left\{n \le x : \frac{\log t(n; \omega)}{\log n} \ge 2 - \alpha\right\}\right|\right] = x^{\alpha + o(1)} \tag{4.11}$$

as $x \to \infty$.

These theorems suggest formulating the following conjecture.

Conjecture 4.2 The 3x + 1 maximum excursion constant ρ defined in (2.14) is finite and is
given by

$$\rho = 2. \tag{4.12}$$

The large deviations model also makes a prediction on the graphs of the trajectories achieving
maximum excursion, when plotted as the scaled data points

$$\left\{\left(\frac{k}{\log n}, \frac{\log T^{(k)}(n)}{\log n}\right) : 0 \le k \le \sigma_\infty(n)\right\}.$$

It asserts that extremal large deviation trajectories should approximate two line segments,
the first with vertices (0, 1) to (7.645, 2) and then from this vertex to (21.55, 0). The slope
of the first line segment is $\frac{3}{4}\log 3 - \log 2 \approx 0.1308$ and that of the second line segment is
$\frac{1}{2}\log\frac{3}{4} \approx -0.1438$. This prediction is shown as a dotted black line in Figure 4.4; it shows
substantial agreement with the empirical evidence.
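The two vertices follow from the slopes by elementary arithmetic: the first segment climbs from height 1 to height 2, and the second descends from height 2 to height 0 at the drift rate of (3.5). A short check, with variable names of our own choosing:

```python
import math

# Slope of the rising segment and of the falling segment (the drift of (3.5)).
first_slope = 0.75 * math.log(3) - math.log(2)
second_slope = 0.5 * math.log(0.75)

# Horizontal coordinate where the rising segment reaches height 2, starting from (0, 1).
peak_time = (2 - 1) / first_slope

# Horizontal coordinate where the falling segment, starting at height 2, reaches height 0.
end_time = peak_time + 2 / abs(second_slope)

print(round(first_slope, 4), round(second_slope, 4))
print(round(peak_time, 3), round(end_time, 2))
```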



4.4. 3x + 1 RRW Model: Critique 

The 3x + 1 repeated random walk model has the feature that random walks for different n are
independent. However the actual 3x + 1 map certainly has a great deal of dependency built
in, due to the fact that trajectories coalesce under forward iteration. For example, trajectories
of numbers 8n + 4 and 8n + 5 always coalesce after 3 iterations of T. After coalescence, the
trajectories are completely correlated. In fact, the 3x + 1 Conjecture predicts that all trajectories
of positive integers n reach the orbit {1, 2} and then cycle, whence they all should coalesce into
exactly two classes, namely those that reach 1 in an odd number of iterations of T, and those
that reach this orbit under an even number of iterations.

Figure 4.4: Scaled trajectories of seeds n from Table 3 plotted against the predicted trajectory.
The trajectory of n = 27 is thin, while the others are thick.

For this reason, it is not apparent a priori whether the prediction in Conjecture 4.1 above
of the constant $\gamma = \gamma_{RRW}$ is reasonable. Our faith in Conjecture 4.1 relies on the fact that
first, the same prediction is made using a branching random walk model that incorporates
dependency in the model, see Theorem 6.4 in §6, and second, on comparison with empirical data
in Table 1.



5. 3x + 1 Accelerated Forward Iteration : Brownian Motion 



Now we consider the accelerated 3x + 1 function U. Recall that U is defined on odd integers,
and removes all powers of 2 in one fell swoop. Iterates of the accelerated function U are
of course equivalent (from the point of view of the main conjecture) to those of T, but there
are some subtle differences which make studying both points of view appealing.

For an odd integer n, we let o(n) count the number of powers of 2 dividing 3n + 1, so that

$$o(n) := \mathrm{ord}_2(3n + 1). \tag{5.1}$$

Then the accelerated 3x + 1 function U is given by

$$U(n) := \frac{3n + 1}{2^{o(n)}}. \tag{5.2}$$

In analogy with the (truncated) parity sequence, cf. Definition 2.1, we make the following
definition, giving a symbolic dynamics for the accelerated 3x + 1 map.
Definition 5.1 (i) For an odd integer n, define the o-sequence of n to be

$$V(n) := (o_1(n), o_2(n), o_3(n), \ldots) \tag{5.3}$$

where

$$o_k(n) := o\left(U^{(k-1)}(n)\right),$$

and $U^{(k)}(n)$ denotes the k-th iterate of U, as usual. This is an infinite vector of positive integers.

(ii) For $k \ge 1$ the k-truncated o-sequence of n is

$$V^{(k)}(n) := (o_1(n), o_2(n), \ldots, o_k(n)), \tag{5.4}$$

i.e. a vector giving the initial segment of k terms of V(n).

Definition 5.2 For an odd integer n and $k \ge 1$, let the k-size $s_k(n)$ be the sum of the entries
in $V^{(k)}(n)$, that is

$$s_k(n) := o_1(n) + o_2(n) + \cdots + o_k(n).$$

5.1. The Structure Theorem 

Notice that U(n) is not only odd, but is also relatively prime to 3. Hence we lose no generality
by restricting the domain for U from $\mathbb{Z}$ to the (more natural) set Π of positive integers prime
to 2 and 3, i.e.

$$\Pi := \{n \in \mathbb{Z}^{+} : \gcd(n, 6) = 1\}. \tag{5.5}$$

Moreover, Π is the disjoint union of $\Pi^{(1)}$ and $\Pi^{(5)}$, where $\Pi^{(\epsilon)}$ consists of positive integers
congruent to ε (mod 6), ε = 1 or 5.

Definition 5.3 Given ε = 1 or 5, $k \ge 1$, and a vector $(o_1, \ldots, o_k)$ of positive integers, let

$$\Sigma^{(\epsilon)}(o_1, \ldots, o_k)$$

be the set of all $n \in \Pi^{(\epsilon)}$ with $V^{(k)}(n) = (o_1, \ldots, o_k)$.

The result analogous to Theorem 2.1 is given by Sinai [35] and Kontorovich-Sinai [18].

Theorem 5.1 (Structure Theorem for o Symbolic Dynamics) Fix ε = 1 or 5, and let $n \in \Pi^{(\epsilon)}$.

(i) The k-truncated o-sequence $V^{(k)}(n)$ of the first k iterates of the accelerated map U(n) is
periodic in n. Its period is $6 \cdot 2^s$, where

$$s = s_k(n) = o_1(n) + o_2(n) + \cdots + o_k(n).$$

(ii) For any $k \ge 1$ and $s \ge k$, each of the $\binom{s-1}{k-1}$ possible vectors $(o_1, \ldots, o_k)$ with $o_j \ge 1$
and $o_1 + \cdots + o_k = s$ occurs exactly once as $V^{(k)}(n)$ for some $n \in \Pi^{(\epsilon)}$ in the initial segment
$1 \le n \le 6 \cdot 2^s$.

(iii) The least element $n_0 \in \Sigma^{(\epsilon)}(o_1, \ldots, o_k)$ satisfies $n_0 < 6 \cdot 2^s$; moreover

$$\Sigma^{(\epsilon)}(o_1, \ldots, o_k) = \left\{6 \cdot 2^s \cdot m + n_0\right\}_{m=0}^{\infty}.$$

Proof. This is proved as part one of the Structure Theorem in Kontorovich-Sinai [18]. Here
(iii) follows immediately from (i) and (ii). ■
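Part (ii) of the Structure Theorem lends itself to a direct check for small k and s. The sketch below is an illustration we add, not code from Kontorovich-Sinai: for a given residue ε it scans all n congruent to ε (mod 6) in $1 \le n \le 6 \cdot 2^s$, collects the k-truncated o-sequences whose entries sum to s, and confirms that each of the $\binom{s-1}{k-1}$ compositions of s appears exactly once. It takes $o_1(n) = o(n)$, i.e. the first entry is read off from n itself.

```python
from collections import Counter
from math import comb


def o(n: int) -> int:
    """ord_2(3n + 1): the exponent of the largest power of 2 dividing 3n + 1."""
    m = 3 * n + 1
    return (m & -m).bit_length() - 1


def U(n: int) -> int:
    """Accelerated 3x + 1 map on odd integers."""
    return (3 * n + 1) >> o(n)


def truncated_o_sequence(n: int, k: int) -> tuple[int, ...]:
    """V^(k)(n) = (o_1(n), ..., o_k(n)) with o_j(n) = o(U^(j-1)(n))."""
    seq = []
    for _ in range(k):
        seq.append(o(n))
        n = U(n)
    return tuple(seq)


def check_structure(eps: int, k: int, s: int) -> bool:
    """True if every composition of s into k parts occurs exactly once in 1..6*2^s."""
    seen = Counter()
    for n in range(eps, 6 * 2 ** s + 1, 6):
        v = truncated_o_sequence(n, k)
        if sum(v) == s:
            seen[v] += 1
    return len(seen) == comb(s - 1, k - 1) and all(c == 1 for c in seen.values())


print(all(check_structure(eps, k, s)
          for eps in (1, 5) for k in range(1, 5) for s in range(k, k + 6)))
```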

Again one easily shows that an integer n is uniquely determined by the o-sequence V(n) of
its forward U-orbit.

Moreover, the following result shows that the image under the iterated map $U^{(k)}$ of
$\Sigma^{(\epsilon)}(o_1, \ldots, o_k)$ is also a nice arithmetic progression!

Theorem 5.2 (Iterated Structure Theorem) Fix ε = 1 or 5, $k \ge 1$, a vector $(o_1, \ldots, o_k)$, and
let $s = o_1 + \cdots + o_k$. Suppose $1 \le n_0 < 6 \cdot 2^s$ is the least element of $\Sigma^{(\epsilon)}(o_1, \ldots, o_k)$. Then there
is a $\delta_k$ = 1 or 5 and an $r_k \in \{0, 1, 2, \ldots, 3^k - 1\}$, both depending only on ε and $(o_1, \ldots, o_k)$,
such that, for each positive integer m,

$$U^{(k)}(6 \cdot 2^s \cdot m + n_0) = 6\,(3^k \cdot m + r_k) + \delta_k. \tag{5.6}$$

Moreover, $\delta_k$ is determined by the congruence

$$\delta_k \equiv 2^{o_k} \pmod 3. \tag{5.7}$$

Proof. This is part two of the Structure Theorem in Kontorovich-Sinai [18]. Note that m is
the same number on both sides of (5.6); this equation says that an arithmetic progression with
common difference $6 \cdot 2^s$ is mapped under $U^{(k)}$ to one with common difference $6 \cdot 3^k$. ■
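The identity (5.6) and the congruence (5.7) can be tested numerically for individual vectors. The following sketch is ours and only spot-checks the statement: it finds the least element $n_0$ by scanning, reads off $r_k$ and $\delta_k$ from $U^{(k)}(n_0)$, and then verifies (5.6) for the first few values of m.

```python
def o(n: int) -> int:
    """ord_2(3n + 1) for an odd integer n."""
    m = 3 * n + 1
    return (m & -m).bit_length() - 1


def U(n: int) -> int:
    """Accelerated 3x + 1 map on odd integers."""
    return (3 * n + 1) >> o(n)


def iterate_U(n: int, k: int) -> int:
    """k-th iterate of U."""
    for _ in range(k):
        n = U(n)
    return n


def o_sequence(n: int, k: int) -> tuple[int, ...]:
    """k-truncated o-sequence (o(n), o(U(n)), ..., o(U^(k-1)(n)))."""
    seq = []
    for _ in range(k):
        seq.append(o(n))
        n = U(n)
    return tuple(seq)


def iterated_structure(eps: int, vec: tuple[int, ...]):
    """Return (n_0, r_k, delta_k, holds) for the class eps mod 6 and o-vector vec."""
    k, s = len(vec), sum(vec)
    period = 6 * 2 ** s
    n0 = next(n for n in range(eps, period, 6) if o_sequence(n, k) == tuple(vec))
    r_k, delta_k = divmod(iterate_U(n0, k), 6)
    holds = all(
        iterate_U(period * m + n0, k) == 6 * (3 ** k * m + r_k) + delta_k
        for m in range(10)
    )
    return n0, r_k, delta_k, holds


print(iterated_structure(1, (1,)))
print(iterated_structure(1, (1, 2)))
print(iterated_structure(5, (3, 1, 2)))
```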



5.2. Probability Densities 



We first tweak the notion of natural density defined in (2.6) on subsets of the natural numbers,
by restricting to just elements of our domain Π. For a subset $S \subset \Pi$, let the Π-natural density
be

$$D_\Pi(S) := \lim_{t \to \infty} \frac{|\{n \in S : n \le t\}|}{|\{n \in \Pi : n \le t\}|} = \lim_{t \to \infty} \frac{3}{t}\, |\{n \in S : n \le t\}|,$$

provided that the limit exists. (The factor 3 appears because Π contains two residue classes
modulo 6.)

For a vector $(o_1, \ldots, o_k)$, let

$$\Sigma(o_1, \ldots, o_k) := \Sigma^{(1)}(o_1, \ldots, o_k) \cup \Sigma^{(5)}(o_1, \ldots, o_k).$$

Recall that a random variable X is geometrically distributed with parameter 0 < p < 1 if

$$\mathbb{P}[X = m] = p^{m-1}(1 - p) \quad \text{for } m = 1, 2, 3, \ldots$$
Theorem 5.3 (Geometric Distribution)

(1) The sets $\Sigma(o_1, \ldots, o_k)$ have a Π-natural density given by

$$D_\Pi\left(\Sigma(o_1, \ldots, o_k)\right) = 2^{-o_1} \cdot 2^{-o_2} \cdots 2^{-o_k}. \tag{5.8}$$

(2) This natural density matches the probability density of the distribution for independent
geometrically distributed random variables $(\rho_1, \ldots, \rho_k)$ with parameter $p = \frac{1}{2}$, which have

$$\mu_o := \mathbb{E}[\rho_j] = 2, \quad \text{and} \quad \sigma_o := \mathrm{Var}[\rho_j] = 2.$$

That is,

$$\mathbb{P}[(\rho_1 = o_1, \ldots, \rho_k = o_k)] = D_\Pi\left(\Sigma(o_1, \ldots, o_k)\right). \tag{5.9}$$

Proof. (1) The existence of a natural density is automatic, since these sets are finite unions
of arithmetic progressions. For ε = 1 or 5, we easily compute from Theorem 5.1 that

$$D_\Pi\left(\Sigma^{(\epsilon)}(o_1, \ldots, o_k)\right) = 3 \cdot \frac{1}{6 \cdot 2^s} = \frac{1}{2} \cdot 2^{-o_1 - \cdots - o_k},$$

and hence (5.8) follows.

(2) The identity (5.9) is immediate from independence and (5.8). ■
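For a single symbol the geometric law can be confirmed exactly on a full period. The sketch below is an added illustration: among the elements of Π up to $6 \cdot 2^s$, the proportion with o(n) = m should equal $2^{-m}$ for every $m \le s$, by the periodicity in Theorem 5.1. The choice s = 12 is arbitrary.

```python
from fractions import Fraction


def o(n: int) -> int:
    """ord_2(3n + 1) for an odd integer n."""
    m = 3 * n + 1
    return (m & -m).bit_length() - 1


def first_symbol_frequency(m: int, s: int = 12) -> Fraction:
    """Proportion of n in Pi with n <= 6 * 2^s for which o(n) = m (exact for m <= s)."""
    members = [n for n in range(1, 6 * 2 ** s + 1) if n % 6 in (1, 5)]
    hits = sum(1 for n in members if o(n) == m)
    return Fraction(hits, len(members))


for m in range(1, 7):
    print(m, first_symbol_frequency(m), Fraction(1, 2 ** m))
```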

We now deduce the following result.

Theorem 5.4 (Central Limit Theorem) For the accelerated 3x + 1 map U, with symbolic
iterates $(o_1, o_2, \ldots)$, the scaled ordinates satisfy

$$\lim_{k \to \infty} D_\Pi\left(n : \frac{o_1(n) + \cdots + o_k(n) - \mu_o k}{\sqrt{\sigma_o k}} < a\right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-u^2/2}\, du.$$

Proof. This follows immediately from the argument above and the Central Limit Theorem
for geometrically distributed random variables. ■

Compare the above to Theorem 3.2. The rate of convergence is again quite slow (this feature
is shared by Borovkov-Pfeifer; see (3.6)).



5.3. Brownian Motion 

Consider some starting value $x_0 = n \in \Pi$, denote its iterates by $x_k := U^{(k)}(x_0)$, and take
logarithms $y_k := \log x_k$. As in (3.3), the multiplicative behavior of U is converted via logarithms
to an additive behavior. Normalize the above by

$$\omega_k := \frac{y_k - y_0 - k \log\left(\frac{3}{4}\right)}{\sqrt{2k}\, \log 2}. \tag{5.10}$$

Then we have the following scaling limits for "random" accelerated trajectories, chosen in
the sense of density.
Theorem 5.5 (Geometric Brownian Motion Increments) Fix a partition of the interval [0, 1]
as $0 = t_0 < t_1 < \cdots < t_r = 1$. Given an integer k, set $k_j = \lfloor t_j k \rfloor$, with j = 1, ..., r. Then for
any $a_j < b_j$,

$$D_\Pi\left(x_0 : a_j < \omega_{k_j} - \omega_{k_{j-1}} < b_j, \text{ for all } j = 1, 2, \ldots, r\right) \to \prod_{j=1}^{r} \left(\Phi(b_j) - \Phi(a_j)\right)$$

as $k \to \infty$, where recall that $\Phi(a)$ is the cumulative distribution function for the standard normal
distribution:

$$\Phi(a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-x^2/2}\, dx.$$

Proof. This appears as Theorem 5 in Kontorovich-Sinai [18]. See Figure 5.5. ■

Figure 5.5: A sample path of the 3x + 1 map. Here we took the starting value $x_0$ =
123 456 789 135 791 113 151 719, computed 150 iterates of U, and plotted $\omega_k$.

The interpretation of the above result is that the paths of the accelerated 3x + 1 map, when
properly scaled, approach those of a geometric Brownian motion, that is, a stochastic process
whose logarithm is a Brownian motion, or a Wiener process.



Remark. There are in fact two limits taken in the above theorem, and their order cannot be
interchanged! The first limit is hidden inside the definition of density, that is, first we
take the limit as $x \to \infty$ of the set of all $x_0 \le x$ satisfying the given condition with the number
$k$ of iterates of $U$ fixed, and only then do we let $k \to \infty$. If $x_0$ were to be fixed and $k$ allowed to
grow, there would be nothing stochastic at all happening, since we believe the 3x + 1 Conjecture!



Remark. The drift, as given in (5.10), is $\log\left(\frac{3}{4}\right) \approx -0.28768$. Compare this to (3.5), where
the drift of the Biased Random Walk model is computed to be $\frac{1}{2}\log\left(\frac{3}{4}\right) \approx -0.14384$. While it
is not surprising that the accelerated map $U$ should have a more aggressive pull to the origin,
it is curious that it is exactly twice as fast (on an exponential scale) as the 3x + 1 function $T$.

Remark. Given that the drift of the (logarithm of the) accelerated 3x + 1 function $U$ is
$\mu = \log\left(\frac{3}{4}\right)$, one expects that the typical total stopping time of a seed $n$ is roughly

$$\frac{1}{|\mu|} \log n \approx 3.476 \log n.$$
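The constants in these two remarks, and the normalization (5.10), can be evaluated directly. The sketch below again assumes $U(n) = (3n+1)/2^{o(n)}$ on odd $n$; the helper names `accel`, `omega` and `typical_stopping_coefficient` are hypothetical and not taken from the paper.

```python
import math

MU = math.log(3 / 4)                  # drift of log U per accelerated step
SIGMA = math.sqrt(2) * math.log(2)    # per-step standard deviation used in (5.10)


def accel(n):
    """Accelerated 3x + 1 step on an odd integer n (assumed definition of U)."""
    m = 3 * n + 1
    while m % 2 == 0:
        m //= 2
    return m


def omega(x0, k):
    """Scaled ordinate of (5.10) after k accelerated steps from the odd seed x0."""
    x = x0
    for _ in range(k):
        x = accel(x)
    return (math.log(x) - math.log(x0) - k * MU) / (math.sqrt(k) * SIGMA)


def typical_stopping_coefficient():
    """Heuristic coefficient 1/|mu| of log n in the typical total stopping time."""
    return 1 / abs(MU)
```

For instance, `omega(123456789135791113151719, 150)` evaluates the last point of the kind of path shown in Figure 5.5.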



5.4. Entropy 



Definition 5.4 The entropy of a random variable $X$ taking values in $[M] := \{1, 2, \ldots, M\}$ is
given by

$$H := -\sum_{m=1}^{M} \mathbb{P}[X = m] \log \mathbb{P}[X = m].$$

The following facts are classical: 

(i) If X is distributed uniformly in [M] then H = log M. 

(ii) The entropy H is maximized by the uniform distribution. 

The first is an elementary exercise, while the second is proved easily using, e.g., Lagrange's 
multiplier method. 
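Both classical facts are easy to illustrate numerically. This is a minimal sketch using the sign convention under which fact (i) holds (entropy is minus the sum of $p \log p$, with natural logarithms); the function name is illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy -sum p*log(p) in nats; zero-probability terms contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)
```

Comparing `entropy([1/8] * 8)` with `math.log(8)` illustrates (i), and any non-uniform distribution on the same number of points gives a strictly smaller value, in line with (ii).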



In light of Theorem 5.2, for any fixed $k \ge 1$, the value $0 \le r_k \le 3^k - 1$ is a function of the
values $\varepsilon$ and $(o_1, \ldots, o_k)$, and hence has a natural density. For a fixed $r \in [0, 3^k - 1]$ we write

$$\mathbb{D}_\Pi[x_0 : r_k(x_0) = r] \quad \text{to mean} \quad \sum_{\substack{(o_1, \ldots, o_k),\ \varepsilon \in \{1,5\} \\ r_k(\varepsilon, o_1, \ldots, o_k) = r}} \mathbb{D}_\Pi\left[\Pi^{(\varepsilon)}(o_1, \ldots, o_k)\right].$$

One might hope that $r_k$ (which is a deterministic function but can be thought of as a "random
variable") is close to being uniformly distributed in $\{0, 1, \ldots, 3^k - 1\}$; then one could attempt
to "bootstrap" iterations of $U$ to one another to have better quantitative control on various
asymptotic densities with $k$ not too large. Were this to be the case, the entropy (defined for
this using $\mathbb{D}_\Pi$ in place of $\mathbb{P}$) would be $\log 3^k = k \log 3$.

Theorem 5.6 (Entropy of $r_k$) There is some constant $c > 0$ such that the entropy $H$ of $r_k$
satisfies:

$$H \ge k \log 3 - c \log k.$$

Proof. This statement is Theorem 5.1 in Sinai [35]. ■



The function $r_k$ in Theorem 5.2 is accompanied by the residue class $\delta_k \in \{1, 5\}$, which
satisfies, cf. (5.7),

$$\delta_k \equiv 2^{o_k} \pmod{3}.$$

It follows immediately from the fact that $o_k$ is geometrically distributed with parameter 1/2,
that

$$\mathbb{D}_\Pi[x_0 : \delta_k(x_0) = 1] = \mathbb{D}_\Pi[x_0 : o_k \text{ is even}] = \frac{1}{3},$$

and hence of course, $\mathbb{D}_\Pi[x_0 : \delta_k(x_0) = 5] = \frac{2}{3}$.
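The two densities can be checked by summing the geometric law directly. Assuming only that the event $o_k = m$ has density $2^{-m}$ for each $m \ge 1$ (the geometric distribution with parameter 1/2 used above), the even and odd values of $m$ contribute:

```latex
\[
\sum_{j \ge 1} 2^{-2j} = \frac{1/4}{1 - 1/4} = \frac{1}{3},
\qquad
\sum_{j \ge 0} 2^{-(2j+1)} = \frac{1/2}{1 - 1/4} = \frac{2}{3}.
\]
```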

Moreover, if $r_k$ is uniformly distributed, then so are the digits $h_k(j) \in \{0, 1, 2\}$ in its 3-adic
expansion:

$$r_k = h_k(k-1) \cdot 3^{k-1} + h_k(k-2) \cdot 3^{k-2} + \cdots + h_k(1) \cdot 3 + h_k(0).$$

Note that only the first few leading digits $h_k(k-1), h_k(k-2), \ldots, h_k(k-t)$ are needed to
specify the location of $r_k/3^k$, to within an error of $1/3^t$.

Theorem 5.7 (Joint Uniform Distribution) The joint distributions of $(r_k/3^k, \delta_k)$ converge
weakly to the uniform distribution, that is, for any fixed $t \ge 1$ and $\mathfrak{h}_1, \ldots, \mathfrak{h}_t \in \{0, 1, 2\}$, as
$k \to \infty$,

$$\mathbb{D}_\Pi[x_0 : h_k(k-1) = \mathfrak{h}_1,\ h_k(k-2) = \mathfrak{h}_2,\ \ldots,\ h_k(k-t) = \mathfrak{h}_t,\ \delta_k(x_0) = 1] \to \frac{1}{3^t} \cdot \frac{1}{3}$$

and

$$\mathbb{D}_\Pi[x_0 : h_k(k-1) = \mathfrak{h}_1,\ h_k(k-2) = \mathfrak{h}_2,\ \ldots,\ h_k(k-t) = \mathfrak{h}_t,\ \delta_k(x_0) = 5] \to \frac{1}{3^t} \cdot \frac{2}{3}.$$

Proof. This appears as Theorem 1 in Sinai [36]. See also [37]. ■

6. 3x + 1 Backwards Iteration: 3x + 1 Trees

One can also model backwards iteration of the 3x + 1 map $T(x)$.

Backwards iteration is described by a tree of inverse iterates, in which each node has either one or
two inverse iterates. Here

$$T^{(-1)}(n) = \begin{cases} \{2n\} & \text{if } n \equiv 0, 1 \pmod 3, \\ \left\{2n, \frac{2n-1}{3}\right\} & \text{if } n \equiv 2 \pmod 3. \end{cases}$$

Starting from a root node labelled $a$ we can grow an infinite tree $\mathcal{T}(a)$ of all the inverse iterates
of $a$. Each node in the tree is labelled by its associated 3x + 1 function value. To a node labelled
$n$ we add either one or two (directed) edges from the elements of $T^{-1}(n)$ to $n$, and we label
each such edge by the value of that element.

6.1. Pruned 3x + 1 Trees

Next we note that any $a \equiv 0 \pmod 3$ has exactly one inverse iterate, which itself is $\equiv 0 \pmod 3$.
Thus if $a \equiv 0 \pmod 3$ the set of inverse iterates forms a single branch that never divides.
However if $a \not\equiv 0 \pmod 3$ then the tree grows exponentially in size. It is convenient therefore
to restrict to numbers $a \not\equiv 0 \pmod 3$ and furthermore to prune such a tree to remove all nodes
$n \equiv 0 \pmod 3$. This produces an (infinite depth) pruned tree $\mathcal{T}^*(a)$ which is described by
inverse iterates of the modified map

$$(T^*)^{-1}(n) = \begin{cases} \{2n\} & \text{if } n \equiv 1, 4, 5 \text{ or } 7 \pmod 9, \\ \left\{2n, \frac{2n-1}{3}\right\} & \text{if } n \equiv 2 \text{ or } 8 \pmod 9, \end{cases} \tag{6.1}$$

applied starting with root node labelled $n_0 := a$. The pruning operation is depicted in Figure
6.6, with root node assigned depth 0.



[Figure 6.6 panels: (i), (ii) $\mathcal{T}_4(4)$, (iii) $\mathcal{T}^*_4(4)$.]

Figure 6.6: 3x + 1 trees $\mathcal{T}_k(a)$ and "pruned" 3x + 1 tree $\mathcal{T}^*_k(a)$, with $k = 4$.

We obtain a reduced tree $\overline{\mathcal{T}}^*(a)$ by labelling each node with the (mod 2) residue
class of the 3x + 1 value assigned to that node. (One may also think of this as labelling the
directed edge leaving this node, with the exception of the root node.)

We let $\mathcal{T}^*_k(a)$ denote the pruned tree with root node $n_0 = a$, cut off at depth $k$, and we
let $\overline{\mathcal{T}}^*_k(a)$ denote the same tree, keeping only the node labels (mod 2), for all nodes except the
root node, where no data is kept. Let $N^*(k; a)$ count the number of depth $k$ leaves in this tree.
Then we have

$$N^*(k, a) := |\{n : n \not\equiv 0 \pmod 3 \text{ and } T^{(k)}(n) = a\}|. \tag{6.2}$$

We have $N^*(k, a) \le 2^k$ as a consequence of the fact that each 3x + 1 tree has at most two
upward branches at each node.

The following result gives information on the sizes of depth $k$ trees over all possible tree
types ([23, Theorem 3.1]).



Theorem 6.1 (Structure of Pruned 3x + 1 Trees)

(1) For $k \ge 1$ and $a \not\equiv 0 \pmod 3$, the structure of the pruned level $k$ tree $\mathcal{T}^*_k(a)$, and hence
the number $N^*(k, a)$, is completely determined by $a \pmod{3^{k+1}}$.

(2) There are $2 \cdot 3^k$ residue classes $a \pmod{3^{k+1}}$ with $a \not\equiv 0 \pmod 3$. For these

$$\sum_{\substack{a \pmod{3^{k+1}} \\ a \not\equiv 0 \pmod 3}} N^*(k, a) = 2 \cdot 4^k. \tag{6.3}$$

It follows that if a residue class $a \pmod{3^{k+1}}$ with $a \not\equiv 0 \pmod 3$ is picked with the uniform
distribution, the expected number of leaves in the random tree $\mathcal{T}^*_k(a)$ is exactly $\left(\frac{4}{3}\right)^k$.
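The count in part (2) can be verified by brute force for small depths. The following sketch grows pruned trees by backward iteration, using the rule that a node $n \not\equiv 0 \pmod 3$ always has the preimage $2n$ and has the extra preimage $(2n-1)/3$ exactly when $n \equiv 2$ or $8 \pmod 9$. It takes one integer representative per residue class, which is legitimate by part (1); the function names are illustrative.

```python
def pruned_preimages(n):
    """Preimages of n under T that are not divisible by 3 (n itself not 0 mod 3)."""
    out = [2 * n]
    if n % 9 in (2, 8):            # (2n - 1)/3 is an integer and is not 0 mod 3
        out.append((2 * n - 1) // 3)
    return out


def leaf_count(a, k):
    """N*(k, a): number of depth-k leaves of the pruned tree rooted at a."""
    level = [a]
    for _ in range(k):
        level = [m for n in level for m in pruned_preimages(n)]
    return len(level)


def total_leaves(k):
    """Sum of N*(k, a) over one representative a of each class mod 3**(k+1), 3 not dividing a."""
    return sum(leaf_count(a, k) for a in range(1, 3 ** (k + 1) + 1) if a % 3)
```

Dividing `total_leaves(k)` by the number of classes, $2 \cdot 3^k$, recovers the expected leaf count $(4/3)^k$.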

We now consider the complete set of numbers having total stopping time $k$. Set

$$N_k := |\{n : \sigma_\infty(n) = k\}|. \tag{6.4}$$

Recall from §2.5 that $N_k = N_k(1)$, where $N_k(a)$ counts the number of integers that iterate to
$a$ after exactly $k$ iterations of the 3x + 1 map $T$. We defined there the 3x + 1 tree growth
constants

$$\delta_3(a) := \limsup_{k \to \infty} \frac{1}{k} \log N_k(a).$$



Theorem 6.1 suggests the following conjecture for these tree growth constants, made by
Lagarias and Weiss [23].

Conjecture 6.1 For each $a \not\equiv 0 \pmod 3$, the 3x + 1 tree growth constant $\delta_3(a)$ is given by

$$\delta_3(a) = \log\left(\frac{4}{3}\right). \tag{6.5}$$

Applegate and Lagarias [2] determined by computer the maximal and minimal number of
leaves in 3x + 1 trees of depth $k$ for $k \le 30$. The maximal and minimal number of leaves in
such trees at level $k$ is given by

$$N^+(k) := \max\{N^*(k, a) : a \pmod{3^{k+1}} \text{ with } a \not\equiv 0 \pmod 3\}$$

and

$$N^-(k) := \min\{N^*(k, a) : a \pmod{3^{k+1}} \text{ with } a \not\equiv 0 \pmod 3\},$$

respectively. Figure 6.7 pictures maximal and minimal trees for depth $k = 5$. (Circled nodes
indicate an omitted inverse iterate under $T^{-1}$ that is $\equiv 0 \pmod 3$.)

The data on these counts $N^{\pm}(k)$ was presented already in §2.5, cf. Table 4. Based on
this data, Applegate and Lagarias [2, Conjecture C] formulated the following strengthened
conjecture, which implies Conjecture 6.1.



Conjecture 6.2 The maximal and minimal number of leaves of 3x + 1 trees satisfy, as $k \to \infty$,

$$N^+(k) = \left(\frac{4}{3}\right)^{k + o(k)} \tag{6.6}$$

and

$$N^-(k) = \left(\frac{4}{3}\right)^{k + o(k)}. \tag{6.7}$$

[Figure 6.7 panels: (i) $\mathcal{T}^*_5(7)$ attains $N^-(5) = 2$; (ii) $\mathcal{T}^*_5(20)$ attains $N^+(5) = 8$.]

Figure 6.7: Maximal and Minimal depth 5 pruned 3x + 1 Trees



6.2. 3x + 1 Backwards Stochastic Models: Branching Random Walks 



Lagarias and Weiss [23] formulated stochastic models for the growth of 3x + 1 trees that were 
multi-type branching processes. Such models grow a random tree, with nodes marked as several 
different kinds of individuals. In this case the number of nodes of each type at each depth k 
(also called generation k) can be viewed as the output of the branching process. The particular 
branching processes they used are multi-type Galton-Watson processes. 

Lagarias and Weiss also modeled the size of preimages of elements in a (pruned) 3x + 1
tree. This size is specified by a real number attached to each node. Branching process models
which attach to each node in the tree a real number giving the position of those individuals on a
line, according to some (possibly random) rule, are called multi-type branching random
walks. Here the location of the individuals on the line gives the random walk aspect; offspring
nodes at level $k$ are shifted in position from their parent ancestor at level $k - 1$ by a point
process. The process starts with a root node giving a single progenitor at level 0 (generation
0).

Lagarias and Weiss defined a hierarchy of branching random walk models, which they
denoted $\mathcal{B}[3^j]$, for each $j \ge 0$. These branching random walk models, having several kinds of
individuals, model the backwards iteration viewed (mod $3^j$). The model for $j = 0$ is simpler
than the other models.

3x + 1 Branching Random Walk $\mathcal{B}[3^0]$. There is one type of individual. With probability
$\frac{2}{3}$ an individual has a single offspring located at a position shifted by $\log 2$ on the line from its
progenitor, and with probability $\frac{1}{3}$ it has two offspring located at positions shifted $\log 2$ and
$\log\frac{2}{3}$ on the line from their progenitor. If the progenitor is in generation (or depth) $k - 1$, the
offspring are in generation $k$. The tree is grown from a single individual at generation 0, with
specified location $\log a$.



The more general models for $j \ge 1$ are given as follows.

3x + 1 Branching Random Walk $\mathcal{B}[3^j]$, ($j \ge 1$). There are $p = 2 \cdot 3^{j-1}$ types of individuals,
indexed by residue classes $a$ (mod $3^j$) with $a \not\equiv 0 \pmod 3$. The distribution of offspring of an
individual of type $a$ (mod $3^j$), at any given depth $k$ in the branching, is determined as follows:
Regard $a$ (mod $3^j$) as labelling a node at depth $d - 1$. Regard it as being, with probability $\frac{1}{3}$ each,
one of the three possible residue classes $\tilde{a}$ (mod $3^{j+1}$) consistent with it. The tree (of depth 1)
with $\tilde{a}$ as root node, given by $(T^*)^{-1}(\tilde{a})$, has either one or two progeny at depth 1, and their
node labels are well-defined classes (mod $3^j$), either $2\tilde{a}$ or, if it legally occurs, $\frac{2\tilde{a} - 1}{3}$ (mod $3^j$).
The branching random walk then produces an individual of type $2\tilde{a}$ at generation $k + 1$ whose
position is additively shifted by $\log 2$ from that of the generation $k$ progenitor node, plus, if
legal, another labelled $\frac{2\tilde{a} - 1}{3}$ which is shifted in position by $\log\left(\frac{2}{3}\right)$ on the line from that of the
generation $k$ node. The tree is grown from a single individual at depth 0, with specified type
and location $\log a$.

In these models, the behavior of the random walk part of the model can be completely 
reconstructed from knowing the type of each node. This is a very special property of these 
branching random walk models, which does not hold for general branching random walks. 

In such models, one may think of the nodes as representing individuals, with individuals at 
level k being children of a particular individual at level k — 1; the random walk aspect indicates 
position in space of these individuals. 

Let $\omega$ denote a single realization of such a branching random walk $\mathcal{B}[3^j]$ which starts from
a single individual $\omega_{0,1}$ of type 1 (mod $3^j$) at depth 0, with initial position labeled $\log a$. Here
$\omega$ describes a particular infinite tree. We let $N_k(\omega)$ denote the number of individuals at level $k$
of the tree. We let $S(\omega_{k,j})$ denote the position of the $j$-th individual at level $k$ in the tree, for
$1 \le j \le N_k(\omega)$.

These models are all supercritical branching processes in the following very strong sense. In
every random realization $\omega$, the number of nodes at level $d$ grows exponentially in $d$, and there
are no extinction events.

Lagarias and Weiss [23] observed that the predictions of these models stabilized for all
$j \ge 1$, as far as the behavior of asymptotic statistics related to 3x + 1 trees is concerned. This
is illustrated in the following theorems.

6.3. 3x + 1 Backwards Model Prediction: Tree Sizes 

Concerning the number of nodes $N_k(\omega)$ in a realized tree at depth $k$, Lagarias and Weiss proved
the following result [23, Corollary 3.1].

Theorem 6.2 (3x + 1 Stochastic Tree Size) For all $j \ge 0$, a realization $\omega$ of a tree grown in
the 3x + 1 branching random walk model $\mathcal{B}[3^j]$ has

$$\lim_{k \to \infty} \frac{1}{k} \log N_k(\omega) = \log\left(\frac{4}{3}\right), \quad \text{for almost every } \omega. \tag{6.8}$$

This result only uses the Galton-Watson structure built into the process $\mathcal{B}[3^j]$. Its prediction
is consistent with the rigorous results on average tree size for pruned 3x + 1 trees given in
Theorem 6.1, and it also supports Conjecture 6.1.
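The growth rate $\log(4/3)$ can be seen in a small simulation of the simplest model. The sketch below tracks only generation sizes (not positions) in the one-type model, under the offspring law stated for $j = 0$: one child with probability 2/3, two children with probability 1/3. The function names, the fixed seed and the chosen depth are illustrative assumptions, not part of the cited results.

```python
import math
import random
from fractions import Fraction

# Mean offspring number: one child w.p. 2/3, two children w.p. 1/3.
MEAN_OFFSPRING = Fraction(2, 3) * 1 + Fraction(1, 3) * 2


def generation_sizes(depth, seed=0):
    """Simulate generation sizes N_0, ..., N_depth of the Galton-Watson process."""
    rng = random.Random(seed)
    sizes = [1]
    for _ in range(depth):
        parents = sizes[-1]
        # Every parent has one child; each independently has a second w.p. 1/3.
        second_children = sum(1 for _ in range(parents) if rng.random() < 1 / 3)
        sizes.append(parents + second_children)
    return sizes


def empirical_growth_rate(depth, seed=0):
    """(1/k) log N_k for a single simulated realization."""
    return math.log(generation_sizes(depth, seed)[-1]) / depth
```

Because every individual has at least one child, the simulated population never dies out, matching the "no extinction events" property noted above; the empirical rate approaches $\log(4/3) \approx 0.2877$ only slowly as the depth grows.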



6.4. 3x + 1 Backwards Model Prediction: Extremal Total Stopping Times 

Next, as a statistic that corresponds to an extremal trajectory, consider the first birth in
generation $k$, which is the leftmost individual on the line at depth $k$ in the branching random walk.
Denote the location of this individual by $L^*_k(\omega)$, for a given realization $\omega$ of the random walk.
Lagarias and Weiss [23, Theorem 3.4] proved the following result.

Theorem 6.3 (Asymptotic First Birth Location) There is a constant $\beta_{BP}$ such that for all $j \ge 0$,
the 3x + 1 branching random walk model $\mathcal{B}[3^j]$ has asymptotic first birth (leftmost birth)

$$\lim_{k \to \infty} \frac{L^*_k(\omega)}{k} = \beta_{BP} \quad \text{for almost every } \omega. \tag{6.9}$$

This constant $\beta_{BP} \approx 0.02399$ is determined uniquely by the property that it is the unique
$\beta > 0$ that satisfies

$$g(\beta) = 0, \tag{6.10}$$

where

$$g(a) := -\sup_{\theta \le 0} \left( a\theta - \log M_{BP}(\theta) \right). \tag{6.11}$$

Here $M_{BP}(\theta)$ is the branching process moment generating function

$$M_{BP}(\theta) := 2^\theta + \frac{1}{3}\left(\frac{2}{3}\right)^\theta. \tag{6.12}$$
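The constant $\beta_{BP}$ can be recovered numerically from (6.10)-(6.12). The sketch below assumes the moment generating function $M_{BP}(\theta) = 2^\theta + \frac{1}{3}(2/3)^\theta$ (one offspring shifted by $\log 2$, plus a second shifted by $\log(2/3)$ with probability 1/3) and rewrites $g(a)$ as the minimum over $\theta \le 0$ of $\log M_{BP}(\theta) - a\theta$. The ternary search and bisection are generic numerical choices made here, not the method of the cited paper, and the search intervals are assumptions that happen to bracket the answer.

```python
import math


def m_bp(theta):
    """Branching process moment generating function (6.12)."""
    return 2.0 ** theta + (1.0 / 3.0) * (2.0 / 3.0) ** theta


def g(a, lo=-50.0, hi=0.0, iters=200):
    """g(a) = min over theta in [lo, 0] of log M_BP(theta) - a*theta (convex in theta)."""
    def phi(theta):
        return math.log(m_bp(theta)) - a * theta

    for _ in range(iters):          # ternary search on a convex function
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if phi(m1) < phi(m2):
            hi = m2
        else:
            lo = m1
    return phi(0.5 * (lo + hi))


def beta_bp(lo=0.0, hi=0.2, iters=60):
    """Root of g on [lo, hi] by bisection; g is negative at lo and positive at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The reciprocal `1 / beta_bp()` is then the model's scaled stopping constant, close to 41.7.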

Since the first birth individual at depth $k$ corresponds to taking $k$ iterations to reach the
root node, we can define a branching process scaled stopping limit $\gamma_{BP}(\omega)$. This is the BP
model's prediction for the scaled stopping constant $\gamma$ from (2.10), defined by

$$\gamma_{BP}(\omega) := \limsup_{k \to \infty} \frac{k}{L^*_k(\omega)}.$$

Theorem 6.3 implies that this value is constant (almost surely independent of $\omega$), and takes the
value

$$\gamma_{BP} = (\beta_{BP})^{-1}. \tag{6.13}$$

Note that since $\beta_{BP} \approx 0.02399$, we have $1/\beta_{BP} \approx 41.7$. At this point we have two completely
different predictions for the scaled stopping constant $\gamma$, one from the RRW model (cf. Theorem
4.1) which approximates forward iterations, and another from the BP models which estimate
backwards iterations. Lagarias and Weiss then prove [23, Theorem 4.1] the following striking
identity.

Theorem 6.4 (3x + 1 Random Walk-Branching Random Walk Duality) The 3x + 1 repeated
random walk (RRW) stochastic model scaled stopping time limit $\gamma_{RRW}$ and the 3x + 1 branching
random walk (BP) model $\mathcal{B}[3^j]$ with $j \ge 0$, scaled stopping time limit $\gamma_{BP}$ are identical! I.e.,

$$\gamma_{RRW} = \gamma_{BP}. \tag{6.14}$$

Proof. This is a consequence [23] of an identity relating the moment generating functions
associated to the two models, which is $M_{BP}(\theta) = M_{RRW}(\theta + 1)$; compare (4.4) and (6.12). ■



Remark. Recall the critique of the RRW model given in §4.4, that various trajectories coalesce
in their forward iterates. But the BP models, by their tree construction, completely take into
account the dependence caused by coalescing trajectories! Since both models predict exactly the
same value for $\gamma$, it appears the critique has been answered.



6.5. 3x + 1 Backwards Model Prediction: Total Preimage Counts 

We next consider what the branching process models have to say about the number of integers 
below X that eventually iterate to a given integer a. 

The following result gives, for the simplest branching random walk model, an almost sure
asymptotic of the number of inverse iterates of size below a given bound ([23, Theorem 4.2]).

Theorem 6.5 (Stochastic Inverse Iterate Counts) For a realization $\omega$ of the branching random
walk $\mathcal{B}[1]$, let $I^*(x; \omega)$ count the number of progeny located at positions $S(\omega_{k,j}) \le \log x$, i.e.

$$I^*(x; \omega) := \#\{\omega_{k,j} : S(\omega_{k,j}) \le \log x, \text{ for any } k \ge 1,\ 1 \le j \le N_k(\omega)\}. \tag{6.15}$$

Then the asymptotic estimate

$$I^*(x; \omega) = x^{1 + o(1)} \quad \text{as } x \to \infty \tag{6.16}$$

holds almost surely.

The model statistic $I^*(x; \omega)$ functions as a proxy for the function $\pi_a(x)$, where $\log a$ gives the
position of the root node of the branching random walk. This result is the stochastic analogue
of Conjecture 2.1 about the 3x + 1 growth exponent.



7. The 5x + 1 Function: Symbolic Dynamics and Orbit Statistics

We now turn for comparison to the 5x + 1 iteration. Some features of the dynamics of this
iteration are similar to those of the 3x + 1 problem, and some are different. Here the dynamics of
iteration in the long run are expected to be quite different globally from the 3x + 1 problem; most
trajectories are expected to diverge. In this section we formulate several orbit statistics for this
map, some the same as for the 3x + 1 map, and some changed. We review basic results on them.



7.1. 5x + 1 Forward Iteration: Symbolic Dynamics 

The basic features of the 5x + 1 problem are similar to the 3x + 1 problem. We introduce the
parity sequence

$$S_5(n) := (n \pmod 2,\ T_5(n) \pmod 2,\ T_5^{(2)}(n) \pmod 2,\ \ldots). \tag{7.1}$$

The symbolic dynamics is similar to the 3x + 1 map: all finite initial symbol sequences of length
$k$ occur, each one for a single residue class (mod $2^k$).






Theorem 7.1 (5x + 1 Parity Sequence Symbolic Dynamics) The $k$-truncated parity sequence
$S_5^{[k]}(n)$ of the first $k$ iterates of the 5x + 1 map $T_5(x)$ is periodic in $n$ with period $2^k$. Each of
the $2^k$ possible 0-1 vectors occurs exactly once in the initial segment $1 \le n \le 2^k$.



Proof. The proof of this result exactly parallels that of Theorem 2.1. ■

As before, the parity sequence of an orbit of $x_0$ uniquely determines $x_0$.
Analyzing this recursion under the assumption that even and odd iterates are equally likely, as
prescribed by Theorem 7.1, we find that the logarithms of iterates grow in size on the average.
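The statement of Theorem 7.1 is easy to test by brute force. The sketch below assumes the usual definition of the 5x + 1 map, $T_5(n) = (5n+1)/2$ for odd $n$ and $n/2$ for even $n$ (consistent with the periodic orbit and growth factors used later in this section); the function names are illustrative.

```python
import math


def t5(n):
    """5x + 1 map: (5n + 1)/2 for odd n, n/2 for even n (assumed definition)."""
    return (5 * n + 1) // 2 if n % 2 else n // 2


def parity_vector(n, k):
    """Parities of n, T5(n), ..., T5^(k-1)(n) as a tuple of 0s and 1s."""
    bits = []
    for _ in range(k):
        bits.append(n % 2)
        n = t5(n)
    return tuple(bits)


def all_vectors_once(k):
    """Check that 1 <= n <= 2**k realizes each of the 2**k parity vectors exactly once."""
    seen = {parity_vector(n, k) for n in range(1, 2 ** k + 1)}
    return len(seen) == 2 ** k


# Average logarithmic step when odd and even iterates are equally likely.
MEAN_LOG_STEP = 0.5 * math.log(5 / 2) + 0.5 * math.log(1 / 2)
```

The positive value of `MEAN_LOG_STEP`, equal to $\frac{1}{2}\log\frac{5}{4}$, is the heuristic reason iterates grow on average.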



7.2. 5x + 1 Forward Iteration: $\lambda^+$-Stopping Times

Most 5x + 1 iteration sequences grow on average, rather than shrinking on average. An
appropriate notion of stopping time for this situation is as follows.

Definition 7.1 For fixed $\lambda > 1$, the $\lambda^+$-stopping time $\sigma^+_\lambda(n)$ of a map $T_5 : \mathbb{Z} \to \mathbb{Z}$ for input $n$
is the minimal value of $k \ge 0$ such that $T_5^{(k)}(n) > \lambda n$, i.e.

$$\sigma^+_\lambda(n) := \inf\left\{k \ge 0 : \frac{T_5^{(k)}(n)}{n} > \lambda \right\}. \tag{7.2}$$

If no such value $k$ exists, we set $\sigma^+_\lambda(n) = +\infty$.



One now has the following result, which parallels Theorem 2.2 for the 3x + 1 map, except
that here iterates grow in size rather than shrink in size.

Theorem 7.2 ($\lambda^+$-Stopping Time Natural Density)

(i) For the 5x + 1 map $T_5(n)$, and fixed $\lambda > 1$ and $k \ge 1$, the set $S^+_\lambda(k)$ of integers having
$\lambda^+$-stopping time at most $k$ has a well-defined natural density $\mathbb{D}(S^+_\lambda(k))$.

(ii) This natural density satisfies

$$\lim_{k \to +\infty} \mathbb{D}(S^+_\lambda(k)) = 1. \tag{7.3}$$

In particular, the set of numbers with finite $\lambda^+$-stopping time has natural density 1.

Proof. Claim (i) follows using the Parity Sequence Theorem 7.1. Here the set is a finite union
of arithmetic progressions (mod $2^k$), except a finite number of initial elements may be omitted
from each such progression.

The result (ii) can be established by a similar argument to that used for the 3x + 1 problem
in Theorem 2.2. ■



Here we note a surprise: there are infinitely many exceptional integers $n$ that have $\lambda^+$-stopping
time equal to $+\infty$! This occurs because the 5x + 1 problem has a periodic orbit
$\{1, 3, 8, 4, 2\}$, and infinitely many positive seeds $n_0$ eventually enter this orbit, e.g. $n_0 = \frac{2^{4k} - 1}{5}$
for any $k \ge 2$. All of these integers have $\sigma^+_\lambda(n_0) = +\infty$. Nevertheless Theorem 7.2 asserts that
such integers have natural density zero.

7.3. 5x + 1 Stopping Time Statistics: Total Stopping Times

The 5x + 1 problem has a finite orbit containing 1, and we may define total stopping time as 
for the 3x + 1 function. 

Definition 7.2 For $n \ge 1$ the total stopping time $\sigma_\infty(n; T_5)$ of the 5x + 1 function is given by

$$\sigma_\infty(n; T_5) := \inf\{k \ge 1 : T_5^{(k)}(n) = 1\}. \tag{7.4}$$

We set $\sigma_\infty(n; T_5) = +\infty$ if no finite $k$ has this property.

Here we expect that the vast majority of positive $n$ will belong to divergent trajectories,
and only a small minority of $n$ have a well-defined finite value $\sigma_\infty(n; T_5) < \infty$. It is an open
problem to prove that even a single trajectory (such as that emanating from the starting seed
$n_0 = 7$) is divergent!



The best we can currently show unconditionally is a lower bound on the size of the extremal 
total stopping time that grows proportionally to logn. 

Theorem 7.3 (Lower Bound for 5x + 1 Total Stopping Times) There are infinitely many $n$
whose total stopping time satisfies

$$\sigma_\infty(n; T_5) \ge \left( \frac{\log 2 + \log 5}{(\log 2)^2} \right) \log n \approx 4.79253 \log n. \tag{7.5}$$

Proof. The Parity Sequence Theorem 7.1 implies there is at least one odd number $n_k$ with
$1 \le n_k < 2^k$ whose first $k - 1$ iterates are also odd, so that $T_5^{(k)}(n_k) \ge \left(\frac{5}{2}\right)^k n_k$. Since a single
step can divide by at most 2, we necessarily have (using $\log n_k \le k \log 2$),

$$\frac{\sigma_\infty(n_k; T_5)}{\log n_k} \ge \frac{k}{\log n_k} + \left( \frac{k \log\frac{5}{2} + \log n_k}{\log 2} \right) \frac{1}{\log n_k} \ge \frac{2}{\log 2} + \frac{\log\frac{5}{2}}{(\log 2)^2} \approx 4.79253.$$

We do not know if these numbers have a finite total stopping time. ■
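The ingredients of this proof can be checked on small cases. The sketch below assumes the same definition of $T_5$ as in the rest of this section, searches for the all-odd seed $n_k$ by brute force (a simplification; the theorem only needs its existence), and evaluates the constant in (7.5). All names are illustrative.

```python
import math


def t5(n):
    """5x + 1 map: (5n + 1)/2 for odd n, n/2 for even n (assumed definition)."""
    return (5 * n + 1) // 2 if n % 2 else n // 2


def all_odd_seed(k):
    """Smallest n in [1, 2**k] such that n, T5(n), ..., T5^(k-1)(n) are all odd."""
    for n in range(1, 2 ** k + 1, 2):
        x = n
        ok = True
        for _ in range(k):
            if x % 2 == 0:
                ok = False
                break
            x = t5(x)
        if ok:
            return n
    raise ValueError("no all-odd seed found")


def kth_iterate(n, k):
    """T5 applied k times to n."""
    for _ in range(k):
        n = t5(n)
    return n


# Constant in (7.5): (log 2 + log 5) / (log 2)^2.
LOWER_BOUND_CONSTANT = (math.log(2) + math.log(5)) / math.log(2) ** 2
```

For example, `all_odd_seed(3)` returns 5, whose orbit begins 5, 13, 33, 83, and 83 exceeds $(5/2)^3 \cdot 5 = 78.125$ as the proof requires.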

The methods of Applegate and Lagarias for 3x + 1 trees can potentially be applied to
this problem, to further improve this lower bound, and to establish it for numbers $n$ having a
finite total stopping time.

An interesting challenge is whether one can show for each $c > 0$ that only a density zero
set of $n$ have $\frac{\sigma_\infty(n; T_5)}{\log n} < c$. A stochastic model in §8.9 predicts that all but finitely many
trajectories having $\sigma_\infty(n) > 85 \log n$ will necessarily have $\sigma_\infty(n) = +\infty$, so establishing this for
$c = 85$ would be consistent with the prediction that only a density zero set of $n$ have 1 in their
forward orbit under $T_5$.

7.4. 5x + 1 Size Statistics: Minimum Excursion Values

In the topsy-turvy world of the 5x + 1 problem, since most trajectories get large, our substitute 
for the maximum excursion constant is the following reversed notion. 

Definition 7.3 For an integer $n$ the minimal excursion value $t^-(n)$ of the 5x + 1 function is
given by

$$t^-(n) := \inf\{|T_5^{(k)}(n)| : k \ge 0\}. \tag{7.6}$$

We have $t^-(0) = 0$, while infinitely many $n$ will have minimum excursion value equal to 1.

Definition 7.4 The minimal excursion constant $\rho_5^-$ of the 5x + 1 function is
given by

$$\rho_5^- := \liminf_{n \to \infty} \frac{\log t^-(n)}{\log n}. \tag{7.7}$$

We now immediately have the following result.

Theorem 7.4 The 5x + 1 minimum excursion constant is given by

$$\rho_5^- = 0. \tag{7.8}$$

Proof. The inverse orbit of $n = 1$ for $T_5$ contains $\{2^j : j \ge 1\}$, whence $t^-(2^j) = 1$. ■

We state this easy result as a theorem, because it has the remarkable feature, among all
the constants associated to these 3x + 1 and 5x + 1 maps, of being unconditionally proved! It
also has the interesting feature that the stochastic models below make an incorrect prediction
in this case, cf. Theorem 8.4.



7.5. 5x + 1 Count Statistics: 5x + 1 Tree Sizes 

In considering backwards iteration of the 5x + 1 function, we can ask: given an integer $a$ how
many numbers $n$ iterate forward to $a$ after exactly $k$ iterations, that is, $T_5^{(k)}(n) = a$?

The set of backwards iterates of a given number $a$ can again be pictured as a tree; we call
these 5x + 1 trees. Now $N_k(a)$ counts the number of leaves at depth $k$ of the tree with root
node $a$, and $N^*_k(a)$ counts the number of leaves in a pruned 5x + 1 tree, which is one from which
all nodes with label $n \equiv 0 \pmod 5$ have been removed. The definitions are as follows.

Definition 7.5 (1) Let $N_k(a; T_5)$ count the number of integers that forward iterate under the
5x + 1 map $T_5(n)$ to $a$ after exactly $k$ iterations, i.e.

$$N_k(a; T_5) := |\{n : T_5^{(k)}(n) = a\}|. \tag{7.9}$$

(2) Let $N^*_k(a; T_5)$ count the number of integers not divisible by 5 that forward iterate under
the 5x + 1 map $T_5(n)$ to $a$ after exactly $k$ iterations, i.e.

$$N^*_k(a; T_5) := |\{n : T_5^{(k)}(n) = a,\ n \not\equiv 0 \pmod 5\}|. \tag{7.10}$$

The case $a = 1$ is of particular interest, since the quantities then count integers that iterate
to 1, and in this case we let

$$N_{k,5} := N_k(1; T_5), \qquad N^*_{k,5} := N^*_k(1; T_5).$$

Definition 7.6 (1) For a given $a$ the 5x + 1 tree growth constant $\delta_5(a)$ for $a$ is given by

$$\delta_5(a) := \limsup_{k \to \infty} \frac{1}{k} \log N_k(a; T_5). \tag{7.11}$$

(2) The 5x + 1 tree growth constant is $\delta_5 := \delta_5(1)$.

The constant $\delta_5(a)$ exists and is finite, as follows from the same upper bound as in (2.17).



The following result gives information on the sizes of depth k pruned 5x + 1 trees over all 
possible tree types. 



Theorem 7.5 (Structure of Pruned 5x + 1 Trees)

(1) For $k \ge 1$ and $a \not\equiv 0 \pmod 5$, the structure of the pruned level $k$ tree $\mathcal{T}^*_k(a)$, and hence
the number $N^*_k(a; T_5)$, is completely determined by $a \pmod{5^{k+1}}$.

(2) There are $4 \cdot 5^k$ residue classes $a \pmod{5^{k+1}}$ with $a \not\equiv 0 \pmod 5$. For these

$$\sum_{\substack{a \pmod{5^{k+1}} \\ a \not\equiv 0 \pmod 5}} N^*_k(a; T_5) = 4 \cdot 6^k. \tag{7.12}$$

It follows that if a residue class $a \pmod{5^{k+1}}$ with $a \not\equiv 0 \pmod 5$ is picked with the uniform
distribution, the expected number of leaves in the random tree $\mathcal{T}^*_k(a)$ is exactly $\left(\frac{6}{5}\right)^k$.

Proof. This result is shown by a method exactly parallel to the 3x + 1 tree case ([23, Theorem
3.1]). We omit details. ■
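Although the proof is omitted, the count in part (2) can be confirmed by enumeration for small depths. The sketch assumes the 5x + 1 map $T_5(n) = (5n+1)/2$ for odd $n$ and $n/2$ for even $n$, so that a node $n$ has the preimage $2n$ and, when $n \equiv 3 \pmod 5$, the extra preimage $(2n-1)/5$, which is kept only if it is not divisible by 5. One integer representative per residue class is used, relying on part (1); the names are illustrative.

```python
def pruned_preimages_5(n):
    """Preimages of n under T5 that are not divisible by 5 (n itself not 0 mod 5)."""
    out = [2 * n]
    if n % 5 == 3:                 # (2n - 1)/5 is an integer exactly in this case
        m = (2 * n - 1) // 5
        if m % 5:                  # prune nodes that are 0 mod 5
            out.append(m)
    return out


def leaf_count_5(a, k):
    """Number of depth-k leaves of the pruned 5x + 1 tree rooted at a."""
    level = [a]
    for _ in range(k):
        level = [m for n in level for m in pruned_preimages_5(n)]
    return len(level)


def total_leaves_5(k):
    """Sum of leaf counts over one representative of each class mod 5**(k+1), 5 not dividing a."""
    return sum(leaf_count_5(a, k) for a in range(1, 5 ** (k + 1) + 1) if a % 5)
```

Dividing `total_leaves_5(k)` by $4 \cdot 5^k$ gives the expected leaf count $(6/5)^k$.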



Theorem 7.5 suggests the following conjecture.

Conjecture 7.1 For each $a \not\equiv 0 \pmod 5$, the 5x + 1 tree growth constant $\delta_5(a)$ is given by

$$\delta_5(a) = \log\left(\frac{6}{5}\right). \tag{7.13}$$

Compare this conjecture with the prediction of Theorem 8.7.

7.6. 5x + 1 Count Statistics: Total Inverse Iterate Counts



In considering backwards iteration of the 5x + 1 function from an integer a, the complete data 
is the set of integers that contain a in their forward orbit. The following function describes this 
set. 

Definition 7.7 Given an integer $a$, the inverse iterate counting function $\pi_{a,5}(x)$ counts the
number of integers $n$ with $|n| \le x$ that contain $a$ in their forward orbit under the 5x + 1 function.
That is

$$\pi_{a,5}(x) := \#\{n : |n| \le x \text{ such that some } T_5^{(k)}(n) = a,\ k \ge 0\}. \tag{7.14}$$

The inverse tree methods for the 3x + 1 problem carry over to the 5x + 1 problem, so that
one can obtain a result qualitatively of the following type, by similar proofs.

Theorem 7.6 (Inverse Iterate Lower Bound) There is a positive constant $c_1$ such that the
following holds. For each $a \not\equiv 0 \pmod 5$, there is some $x_0(a)$ such that for all $x \ge x_0(a)$,

$$\pi_{a,5}(x) \ge x^{c_1}. \tag{7.15}$$

The following statistics measure the size of the inverse iterate set in the sense of fractional 
dimension. 

Definition 7.8 Given an integer $a$, the upper and lower 5x + 1 growth exponents for $a$ are
given by

$$\eta_5^+(a) := \limsup_{x \to \infty} \frac{\log \pi_{a,5}(x)}{\log x},$$

and

$$\eta_5^-(a) := \liminf_{x \to \infty} \frac{\log \pi_{a,5}(x)}{\log x}.$$

If these quantities are equal, we define the 5x + 1 growth exponent $\eta_5(a)$ to be $\eta_5(a) = \eta_5^+(a) = \eta_5^-(a)$.

In parallel with conjectures for the 3x + 1 map, we formulate the following conjecture.

Conjecture 7.2 (5x + 1 Growth Exponent Conjecture) For all integers $a \not\equiv 0 \pmod 5$, the
5x + 1 growth exponent $\eta_5(a)$ exists, and takes a constant value $\eta_5$ independent of $a$. This value
satisfies

$$\eta_5 < 1. \tag{7.16}$$

The stochastic models discussed in §8 suggest that the constant $\eta_5$ exists and takes a value
strictly smaller than 1. There is some controversy concerning the conjectured value of the
constant. In §8 we present a repeated random walk model and a branching random walk model
that both suggest the value $\eta_5 \approx 0.649$. A different branching random walk model formulated
by Volkov [40] suggests the value $\eta_5 \approx 0.678$. Lower bounds toward this conjecture can be
rigorously established, cf. Theorem 7.6 above. We have not bothered to determine $c_1$ in (7.15),
though we suspect it is well below either of the above predictions, and hence cannot distinguish
between them.

8. 5x + 1 Function: Stochastic Models and Results



We now discuss stochastic models for the 5x + 1 problem paralleling those for the 3x + 1 
problem. These include random walk models for forward iteration of the 5x + 1 map, analysis 
of the accelerated 5x + 1 map, and branching random walks for the backwards iteration of the 
5x + 1 map. 

8.1. 5x + 1 Forward Iteration: Multiplicative Random Product Model 

Concerning forward iteration, we may formulate a multiplicative random product model parallel
to that for the 3x + 1 problem. Consider the random products

$$Y_k := X_1 X_2 \cdots X_k,$$

in which the $X_i$ are independent identically distributed (i.i.d.) random variables having
the discrete distribution

$$X_i = \begin{cases} \frac{5}{2} & \text{with probability } \frac{1}{2}, \\ \frac{1}{2} & \text{with probability } \frac{1}{2}. \end{cases}$$

We call this the 5x + 1 multiplicative random product (MRP) model.

As before, this model does not include the choice of starting value of the iteration, which
would correspond to $X_0$; the random variable $Y_k$ really models the ratio $\frac{T_5^{(k)}(X_0)}{X_0}$. We define for
$\lambda > 1$ the $\lambda^+$-stopping time random variable

$$V^+_\lambda(\omega) := \inf\{k : Y_k > \lambda\}, \tag{8.1}$$

where $\omega = (X_1, X_2, X_3, \cdots)$ denotes a sequence of random variables as above. This random
vector $\omega$ models the change in size of a random starting value $n = X_0$ that occurs on iteration
of the 5x + 1 map.

This stochastic model can be used to exactly account for the density of $\lambda^+$-stopping times,
as follows.

Theorem 8.1 ($\lambda^+$-Stopping Time Density Formula) For the 5x + 1 map $T_5(n)$ and any fixed
$\lambda > 1$, the natural density $\mathbb{D}(S^+_\lambda(k))$ for integers having $\lambda^+$-stopping time at most $k$ is given
exactly by the formula

$$\mathbb{D}(S^+_\lambda(k)) = \mathbb{P}[V^+_\lambda(\omega) \le k], \tag{8.2}$$

in which $V^+_\lambda$ is the $\lambda^+$-stopping time random variable in the multiplicative random product
(MRP) model.

Proof. This follows by a parallel argument to that in Borovkov and Pfeifer [10, Theorem 3]
for the 3x + 1 problem. ■

Theorem 8.1 is the stochastic model parallel of Theorem 7.2.
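The exact agreement between the density and the model probability can be observed in a small case. The sketch below assumes $T_5(n) = (5n+1)/2$ for odd $n$ and $n/2$ for even $n$, computes the MRP probability by enumerating all $2^k$ step sequences with exact rational arithmetic, and compares it with the proportion of integers in a block of whole periods mod $2^k$, started well away from the small exceptional seeds. The function names and the sample parameters ($\lambda = 2$, $k = 4$) are illustrative choices.

```python
from fractions import Fraction
from itertools import product


def t5(n):
    """5x + 1 map: (5n + 1)/2 for odd n, n/2 for even n (assumed definition)."""
    return (5 * n + 1) // 2 if n % 2 else n // 2


def stops_within(n, lam, k):
    """True if some iterate T5^(j)(n) with 1 <= j <= k exceeds lam * n."""
    x = n
    for _ in range(k):
        x = t5(x)
        if x > lam * n:
            return True
    return False


def mrp_probability(lam, k):
    """Exact P[V <= k] in the MRP model, by enumerating all 2**k step sequences."""
    hits = 0
    for steps in product((Fraction(5, 2), Fraction(1, 2)), repeat=k):
        y = Fraction(1)
        for s in steps:
            y *= s
            if y > lam:
                hits += 1
                break
    return Fraction(hits, 2 ** k)


def empirical_density(lam, k, start, periods):
    """Proportion of n in [start, start + periods * 2**k) that stop within k steps."""
    length = periods * 2 ** k
    count = sum(stops_within(n, lam, k) for n in range(start, start + length))
    return Fraction(count, length)
```

With $\lambda = 2$ and $k = 4$, the stopping sequences are those beginning with an odd step, or with an even step followed by two odd steps, which accounts for a probability of $\frac{1}{2} + \frac{1}{8}$.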
8.2. 5x + 1 Forward Iteration: Additive Random Walk Model

We next formulate additive random walk models, obtained after logarithmic rescaling of the
5x + 1 iteration. The 5x + 1 iteration takes $x_0 = n$ and $x_k = T_5^{(k)}(n)$. Using a logarithmic
rescaling with $y_k = \log x_k$ (natural logarithm) we have

$$y_k = \log x_k := \log T_5^{(k)}(n).$$

Then we have

$$y_{k+1} = \begin{cases} y_k + \log\frac{5}{2} + e_k & \text{if } x_k \equiv 1 \pmod 2, \\ y_k + \log\frac{1}{2} & \text{if } x_k \equiv 0 \pmod 2, \end{cases} \tag{8.3}$$

with

$$e_k := \log\left(1 + \frac{1}{5 x_k}\right). \tag{8.4}$$

Here $e_k$ is small as long as $x_k$ is large.



We approximate the deterministic process above with the following random walk model with unequal size steps. We take random variables

$$W_k := -\log 2 + \delta_k \log 5,$$

in which the $\delta_k$ are i.i.d. Bernoulli random variables, taking the values 0 and 1 with probability $\frac{1}{2}$ each. The random walk positions $\{Z_k : k \ge 0\}$ are then random variables having starting value $Z_0 = \log m$, for some fixed initial condition $m > 1$, and with

$$Z_k = Z_0 + W_1 + W_2 + \cdots + W_k.$$

The $Z_k$ define a biased random walk, whose expected drift $\mu$ is given by

$$\mu := E[W_k] = -\log 2 + \tfrac{1}{2}\log 5 = \tfrac{1}{2}\log\left(\tfrac{5}{4}\right) \approx 0.11157.$$

The standard deviation $\sigma$ of each step is given by

$$\sigma := \sqrt{\mathrm{Var}[W_k]} = \tfrac{1}{2}\log 5 \approx 0.80472.$$

Call this random walk the 5x + 1 Biased Random Walk Model (5x + 1 BRW Model).

Since the mean of this random walk is positive, this biased random walk has a positive drift. 
This positive drift implies that a random trajectory diverges with probability one. 
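
The drift and step size of this walk can be checked numerically. The sketch below assumes fair coin flips for the Bernoulli variables; the helper names, the seed and the walk length are arbitrary illustrative choices.

```python
import math
import random

LOG2, LOG5 = math.log(2.0), math.log(5.0)

def step_mean_and_sd():
    """Exact mean and standard deviation of W = -log 2 + delta * log 5,
    where delta is 0 or 1 with probability 1/2 each (an assumption of this sketch)."""
    mean = -LOG2 + 0.5 * LOG5
    sd = 0.5 * LOG5  # sqrt(Var[delta]) * log 5 = (1/2) log 5
    return mean, sd

def simulate_brw(m, steps, seed=0):
    """Final position Z_steps of one walk started at Z_0 = log m."""
    rng = random.Random(seed)
    z = math.log(m)
    for _ in range(steps):
        z += -LOG2 + (LOG5 if rng.random() < 0.5 else 0.0)
    return z

mu, sigma = step_mean_and_sd()
z_final = simulate_brw(m=3, steps=100_000, seed=1)
empirical_drift = (z_final - math.log(3)) / 100_000
```

Over a long walk the empirical drift should sit close to `mu`, with fluctuations of order `sigma` divided by the square root of the number of steps.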

Theorem 8.2 For the 5x + 1 BRW model, with probability one, a trajectory $\{Z_k : k \ge 0\}$ diverges to $+\infty$.

Proof. This is an elementary fact about random walks with positive drift. ■

This result implies that a generic trajectory has total stopping time equal to $+\infty$. That is, starting from $Z_0 = \log n$, the probability $P[E_n]$ of the event $E_n$ that for some $k \ge 1$ the total stopping time condition $Z_k \le 0$ is satisfied, is strictly smaller than 1, i.e., $P[E_n] < 1$. It is positive but decreases to 0 as $n$ increases to $+\infty$. (This fact should not be confused with Theorem 8.2: even if $Z_k$ dips below 0, it charges back up to infinity, almost surely.)
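
A small Monte Carlo experiment illustrates how the hitting probability $P[E_n]$ falls off with $n$. This is only a sketch: the walk is truncated after `max_steps` steps (so the estimate is a lower bound), and the trial count, seed and function names are illustrative assumptions.

```python
import math
import random

LOG2, LOG5 = math.log(2.0), math.log(5.0)

def hits_zero(n, rng, max_steps=400):
    """True if a walk started at log n satisfies Z_k <= 0 for some k <= max_steps."""
    z = math.log(n)
    for _ in range(max_steps):
        z += -LOG2 + (LOG5 if rng.random() < 0.5 else 0.0)
        if z <= 0.0:
            return True
    return False

def estimate_hit_probability(n, trials=2000, seed=0):
    """Fraction of simulated walks from log n that reach the half-line Z <= 0."""
    rng = random.Random(seed)
    return sum(hits_zero(n, rng) for _ in range(trials)) / trials

p_small = estimate_hit_probability(2)     # small seed: hitting is common
p_large = estimate_hit_probability(1000)  # large seed: hitting is rare
```

Because of the positive drift, walks that have not returned to 0 after a few hundred steps are very unlikely to do so later, so the truncation has little effect at these sizes.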

To obtain a result parallel to those for the 3x + 1 problem on the average behavior of numbers $n$ having a finite total stopping time, one needs to condition on the set of $n$ that have a finite total stopping time. This appears to be an approachable problem, but it requires a more complicated analysis than that given in [23] or Borovkov and Pfeifer [10].



8.3. 5x + 1 Forward Iteration: Repeated Random Walk Model 

Next, paralleling the 3x + 1 case, we formulate a 5x + 1 Repeated Random Walk (RRW) model as follows. A model trial is the countable set of random variables

$$\omega := \{Z_{k,n} : k \ge 0,\ n \ge 1\}, \tag{8.5}$$

having initial condition $Z_{0,n} = \log n$, with the individual random walks being 5x + 1 biased random walks, as above. In the following subsections we consider other predictions that the RRW model makes for various statistics.

Theorem 8.3 For the 5x + 1 RRW model, with probability one, for every $n \ge 1$ the trajectory $\{Z_{k,n} : k \ge 0\}$ diverges to $+\infty$.



Proof. This follows immediately from Theorem 8.2 since the complement of this event is a 
countable union of measure zero events. ■ 

One might misinterpret the above as suggesting that the 5x + 1 RRW model predicts that all trajectories are unbounded. Of course this is an incorrect prediction. The 5x + 1 iteration has some finite cycles, and furthermore there are an infinite number of integers that eventually enter one of these cycles. The stochastic model above cannot account for such bounded trajectories! Instead we interpret the stochastic model prediction to be that a density one set of integers lie on unbounded trajectories.

This should make you very worried about relying on stochastic models to predict that 3x + 1 trajectories decay! There could potentially be a set of measure zero escaping to infinity, which the model simply cannot see. Such a pathological trajectory is the heart and soul of the 3x + 1 problem, and the root cause of its difficulty!



8.4. 5x + 1 RRW Model Prediction: Minimum Excursion Constant

The 5x + 1 RRW model has the following analogues of minimal excursion values and of the minimum excursion constant.

Definition 8.1 For a realization $\omega = \{Z_{k,n} : k \ge 0,\ n \ge 1\}$ of the 5x + 1 RRW model, the minimal excursion value $t^-(n,\omega)$ is given, for each $n \ge 1$, by

$$t^-(n,\omega) := \inf\{e^{Z_{k,n}} : k \ge 0\}. \tag{8.6}$$

Theorem 8.3 implies that with probability one the value $t^-(n,\omega)$ is well-defined and strictly positive.

Definition 8.2 For a realization $\omega$ of the 5x + 1 RRW model, the minimum excursion constant $\rho_5^-(\omega)$ is given by

$$\rho_5^-(\omega) := \liminf_{n\to\infty} \frac{\log t^-(n,\omega)}{\log n}. \tag{8.7}$$

Now a large deviations analysis yields the following result.

Theorem 8.4 (5x + 1 RRW Minimum Excursion Constant) For the 5x + 1 RRW model, with probability one the quantities $t^-(n,\omega)$ are finite for every $n \ge 1$. In addition, with probability one the random quantity

$$\rho^-_{5,RRW}(\omega) := \liminf_{n\to\infty} \frac{\log t^-(n;\omega)}{\log n} = \liminf_{n\to\infty}\left(\inf_{k \ge 0} \frac{Z_{k,n}}{\log n}\right) \tag{8.8}$$

equals the constant

$$\rho^-_{5,RRW} = 1 - \frac{1}{\theta^*} \approx -1.86466, \tag{8.9}$$

in which $\theta^* \approx 0.3490813$ is the larger of the two real roots of the equation $M_{5,RRW}(\theta) = 1$, where $M_{5,RRW}(\theta) := \frac{1}{2}\left(2^\theta + \left(\frac{2}{5}\right)^\theta\right)$ is a moment generating function associated to the random walk.
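
The constant in Theorem 8.4 is easy to recompute from the stated moment generating function. The sketch below uses plain bisection; the bracket [0.1, 1] is an assumption chosen to exclude the trivial root at 0, and the helper names are illustrative.

```python
import math

def mgf_rrw(theta):
    """Moment generating function (1/2)(2^theta + (2/5)^theta) from Theorem 8.4."""
    return 0.5 * (2.0 ** theta + (2.0 / 5.0) ** theta)

def bisect_root(f, lo, hi, tol=1e-13):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The equation M(theta) = 1 has the trivial root 0; the bracket isolates the positive root.
theta_star = bisect_root(lambda t: mgf_rrw(t) - 1.0, 0.1, 1.0)
rho_minus = 1.0 - 1.0 / theta_star
```

The computed values agree with the constants quoted in the theorem to the digits shown there.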



Proof. This is proved by a large deviations argument similar to that used for the maximum excursion constant for the 3x + 1 problem in Lagarias and Weiss [23, Theorem 2.3]. We sketch the main computation. We estimate the probability $P(r, H, x)$ on a single trial starting at $\log x$ of having a downward displacement of at least $H\log x$ after $r\log x$ steps, that is,

$$\log x - Z_{r\log x,\,x} \ge H\log x.$$

We define $a$ by the condition $H = ar$ and find that the probability is given by Chernoff's bound as

$$P(r, H, x) = \exp\bigl(-g_{5,RRW}(a)\, r\log x\,(1 + o(1))\bigr),$$

in which

$$g_{5,RRW}(a) := \sup_{\theta}\bigl(a\theta - \log M_{5,RRW}(\theta)\bigr) \tag{8.10}$$

is a large deviations rate function, which is the Legendre transform of the logarithm of the moment generating function $M_{5,RRW}(\theta) = \frac{1}{2}\left(2^\theta + \left(\frac{2}{5}\right)^\theta\right)$. The repeated random walk makes $x$ trials $1 \le n \le x$, so the expected number of successes over these trials is $xP(r, H, x)$, and we want this to be at least $x^{\epsilon}$, so that a success occurs infinitely often as $x \to \infty$. (We also will let $\epsilon \to 0$, so we set it equal to zero in what follows.) We want therefore to maximize $H = ar$ subject to the constraint that $g_{5,RRW}(a)\,r \le 1$. To maximize we may take $g_{5,RRW}(a)\,r = 1$, whence $r = 1/g_{5,RRW}(a)$ can be used to eliminate the variable $r$. We now have the maximization problem to maximize $H := a/g_{5,RRW}(a)$ over $0 < a < \infty$. One finds an extremality condition for maximization which yields

$$H^* = \frac{a^*}{g_{5,RRW}(a^*)} = \frac{1}{\theta^*},$$

where $a^*$ achieves the maximum, and $\theta^*$ is the corresponding value in the Legendre transform. Since the walk starts at $\log x$, the lowest point reached is about $(1 - H^*)\log x$, which gives (8.9). Uniqueness of the maximum follows from convexity properties of the function $\log M_{5,RRW}(\theta)$. Detailed error estimates are also needed to verify that this maximum gives the dominant contribution. ■
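
The maximization in this proof sketch can be reproduced numerically. The code below is a rough sketch, not the authors' computation: it evaluates the rate function by ternary search over $\theta \ge 0$ (valid for $0 < a < \log 2$, where the supremum is attained at a positive $\theta$) and then scans a grid of $a$ values; the grid spacing and search bounds are arbitrary choices.

```python
import math

def log_mgf(theta):
    """log of the moment generating function (1/2)(2^theta + (2/5)^theta)."""
    return math.log(0.5 * (2.0 ** theta + (2.0 / 5.0) ** theta))

def rate(a, theta_max=60.0, iters=200):
    """g(a) = sup_{theta >= 0} (a*theta - log M(theta)), by ternary search
    on a concave objective; intended for 0 < a < log 2."""
    lo, hi = 0.0, theta_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if a * m1 - log_mgf(m1) < a * m2 - log_mgf(m2):
            lo = m1
        else:
            hi = m2
    t = 0.5 * (lo + hi)
    return a * t - log_mgf(t)

# Scan a = 0.001, 0.002, ..., 0.599 and keep the largest ratio a / g(a).
grid = [0.001 * i for i in range(1, 600)]
h_star, a_star = max((a / rate(a), a) for a in grid)
```

The grid maximum should come out near 2.8647, matching the reciprocal of the root 0.3490813 quoted in Theorem 8.4, with the maximizing slope near 0.109.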



This constant $\rho^-_{5,RRW}$ found in Theorem 8.4 is negative, i.e. the minimum excursion in the model reaches a real number much smaller than 1! As a prediction for the 5x + 1 problem, this disagrees with the exact answer for the minimum excursion constant for the 5x + 1 problem given in Theorem 7.4.



We view this inaccurate prediction as stemming from the discrepancy that the 5x + 1 function takes only values on the integer lattice, and that its additive correction term is not accounted for in this stochastic model. That is, the stochastic model will not necessarily make good predictions on the behavior of an orbit once the orbit reaches a small value, e.g. $|x| < C$ for any fixed constant $C$. We may hope that the 5x + 1 model still makes an accurate prediction concerning the question: how many integers reach some small value, for example reaching the interval $|x| < C$?



8.5. 5x + 1 RRW Model Prediction: Total Stopping Time Counts 

We can interpret the false prediction above for minimum excursions in a constructive way: as soon as a 5x + 1 trajectory achieves a size $e^{Z_{k,n}} < 1$, it enters a periodic orbit. Therefore this condition can be treated as a "stopping time" condition that detects when a trajectory reaches the value 1.

Theorem 8.5 (5x + 1 RRW Total Stopping Time Counts) For the 5x + 1 RRW model and for a given $\omega$, let

$$S_\infty(\omega) := \{n \ge 1 : e^{Z_{k,n}} < 1 \text{ holds for some } k \ge 1\}.$$

This set collects those seeds $n$ whose trajectory according to $\omega$ "reaches 1". Let $\pi_5(\cdot\,;\omega)$ denote the corresponding counting function,

$$\pi_5(x;\omega) := \#\{1 \le n \le x : n \in S_\infty(\omega)\}.$$

Then

$$\lim_{x\to\infty} \frac{\log \pi_5(x;\omega)}{\log x} = \eta_{5,RRW}, \quad \text{for almost every } \omega.$$

Here $\eta_{5,RRW} \approx 0.650919$ is given by $\eta_{5,RRW} = 1 - \theta_{5,RRW}$, where $\theta_{5,RRW} \approx 0.349081$ is the unique positive solution to the equation

$$M_{5,RRW}(\theta) := \frac{1}{2}\left(2^\theta + \left(\frac{2}{5}\right)^\theta\right) = 1. \tag{8.11}$$



Proof. This can be proved by a large deviations model similar in nature to those considered in Lagarias and Weiss [23, Theorem 2.4]. We sketch the main estimate. For $k = r\log x$, consider the probability $P(r,x)$ that for a single random walk $e^{Z_{r\log x,\,x}} < 1$. Since we make $x$ draws for $1 \le n \le x$ in the repeated random walk, the expected number of such individuals satisfying this property will be $xP(r,x)$. This probability is estimated using Chernoff's bound to be

$$P(r,x) = \exp\bigl(-g_{5,RRW}(a)\, r\log x\,(1 + o(1))\bigr),$$

where $a = \frac{1}{r}$, and $g_{5,RRW}$ is the large deviations rate function (8.10) in Theorem 8.4. We now maximize this probability over $r$. To do this we eliminate $r$ using $r = \frac{1}{a}$, so we want to determine

$$\tau_{5,RRW} := \min_{0 < a < \infty} \frac{g_{5,RRW}(a)}{a}.$$

Then we obtain $xP(r,x) \le x^{1-\tau_{5,RRW}+o(1)}$ for all $r$, with equality holding for $r = \frac{1}{a^*}$, where $a^*$ is the value that attains the minimum of $f(a) := \frac{g_{5,RRW}(a)}{a}$ taken on the positive half-line. The extremality conditions for the minimum lead to the condition $M_{5,RRW}(\theta(a^*)) = 1$, where $\theta$ is the Legendre transform variable, and also to the identity

$$\tau_{5,RRW} = \frac{g_{5,RRW}(a^*)}{a^*} = \theta(a^*) = \theta_{5,RRW}.$$

The strict convexity of the function $\log M_{5,RRW}(\theta)$ is used to get a unique minimum, with $\eta_{5,RRW} = 1 - \tau_{5,RRW}$. For a rigorous proof, one must control various error estimates to show the dominant contribution to the probability comes from a small region near $a^*$. ■



Remark. The value of $\theta_{5,RRW}$ in the minimization problem in the proof of Theorem 8.5 turns out to be identical to that in the maximization problem that is needed for proving Theorem 8.4.



8.6. 5x + 1 Accelerated Forward Iteration: Brownian Motion

Kontorovich and Sinai [18] extended the Structure Theorem (that is, Theorems 5.1 and 5.2) and the consequences on the Central Limit Theorem (Theorem 5.4) and geometric Brownian motion (Theorem 5.5) to a class of functions which they called $(d, g, h)$-maps. The case $d = 2$, $g = 5$, and $h = 1$ corresponds to the accelerated 5x + 1 function, $U_5(n)$.

The analogous distribution and Central Limit Theorems are proved in the same way, leading to the following.

Theorem 8.6 (Geometric Brownian Motion) The rescaled paths of the accelerated 5x + 1 map are those of a geometric Brownian motion with drift $\log(\frac{5}{4})$. By this we mean the following.

For an initial seed $x_0$ which is relatively prime to both 2 and 5, denote its iterates by $x_k := U_5^{(k)}(x_0)$, let $y_k := \log x_k$ and define the scaled variable

$$\omega_k := \frac{y_k - y_0 - k\log(\frac{5}{4})}{\sqrt{2k}\,\log 2}.$$

Partition the interval $[0, 1]$ as $0 = t_0 < t_1 < \cdots < t_r = 1$, and set $k_j = \lfloor t_j k\rfloor$. Then for any $a_j < b_j$, $j = 1, \ldots, r$,

$$\lim_{k\to\infty} \mathbb{D}\bigl[x_0 : a_j < \omega_{k_j} - \omega_{k_{j-1}} < b_j, \text{ for all } j = 1, 2, \ldots, r\bigr] = \prod_{j=1}^{r}\bigl(\Phi(b_j) - \Phi(a_j)\bigr),$$

where $\Phi(a)$ is the cumulative distribution function for the standard normal distribution.

Proof. This is a consequence of Theorem 5 in Kontorovich-Sinai [18]. ■



Remark. The accelerated drift, $\log(\frac{5}{4})$, is again double that of the Biased Random Walk model, which predicts a drift of $\frac{1}{2}\log(\frac{5}{4})$. A zero-mean, unit-variance Wiener process $W_t$ satisfies the "law of the iterated logarithm" almost surely, that is:

$$\limsup_{t\to\infty} \frac{W_t}{\sqrt{2t\log\log t}} = 1,$$

with probability 1. Hence the drift being positive implies that almost every 5x + 1 trajectory escapes to infinity (yet we emphasize again that we do not know how to prove this for a single given trajectory!).
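
The accelerated drift can be observed empirically by averaging the one-step logarithmic growth over many odd seeds. The sketch below assumes the accelerated map divides 5x + 1 by its full power of 2; the name `u5` and the seed range are illustrative, and seeds divisible by 5 are not excluded because only a single step is taken.

```python
import math

def u5(x: int) -> int:
    """Accelerated 5x + 1 step on odd x: divide 5x + 1 by the full power of 2 it contains."""
    y = 5 * x + 1
    while y % 2 == 0:
        y //= 2
    return y

# Average one-step growth of log x over all odd seeds below 2^16.
odd_seeds = range(1, 2**16, 2)
mean_log_growth = sum(math.log(u5(x) / x) for x in odd_seeds) / len(odd_seeds)
```

The average comes out very close to log(5/4), about 0.2231, because the power of 2 removed per step averages 2 over a full set of odd residues.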

8.7. 5x + 1 Backwards Stochastic Models: Branching Random Walks 

We next formulate branching random walks to model the 5x + 1 iteration in exact analogy with the 3x + 1 models. We denote these models $\mathcal{B}[5^j]$ for $j \ge 0$.

5x + 1 Branching Random Walk $\mathcal{B}[5^0]$. There is one type of individual. With probability $\frac{4}{5}$ an individual has a single offspring located at a position shifted by $\log 2$ on the line from its progenitor, and with probability $\frac{1}{5}$ it has two offspring located at positions shifted $\log 2$ and $\log\frac{2}{5}$ on the line from their progenitor. If the progenitor is in generation $k - 1$, the offspring are in generation $k$. The tree is grown from a single individual in generation 0, the root, with specified initial location $\log a$.

The more general models for $j \ge 1$ are given as follows.

5x + 1 Branching Random Walk $\mathcal{B}[5^j]$, ($j \ge 1$). There are $p = 4\cdot 5^{j-1}$ types of individuals, indexed by residue classes $a \pmod{5^j}$ with $a \not\equiv 0 \pmod 5$. The distribution of offspring of an individual of type $a \pmod{5^j}$, at any given generation (or depth) $k$ in the branching, is determined as follows: Suppose $a \pmod{5^j}$ is the type of a node at depth $k - 1$. Now regard it as being, with probability $\frac{1}{5}$ each, one of the five possible residue classes $\tilde{a} \pmod{5^{j+1}}$ consistent with its class $\pmod{5^j}$. A tree of depth 1 having $\tilde{a}$ as root node then has either one or two progeny, at depth 1, given by $(T^*)^{-1}(\tilde{a})$, whose node labels are well-defined classes $\pmod{5^j}$, either $2\tilde{a}$ or, if it legally occurs, $\frac{2\tilde{a}-1}{5} \pmod{5^j}$. The branching random walk then produces an individual of type $2a$ at generation $k$ whose position is additively shifted by $\log 2$ from that of the generation $k - 1$ progenitor node of type $a$ plus, if legal, another node of type $\frac{2\tilde{a}-1}{5} \pmod{5^j}$, which is shifted in position by $\log(\frac{2}{5})$ on the line from that of the generation $(k-1)$-node. The tree is grown from a single individual at depth 0, with specified type and location $\log a$.
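
The branching rule can be made concrete on actual integers. The sketch below lists the preimages of a value under the 5x + 1 map and counts how often a second preimage occurs among residue classes modulo 25 that are prime to 5. The pruning rule used here (a preimage is "legal" only if it is not divisible by 5) is an assumption about what the text means by a legal node, and the function names are illustrative.

```python
from fractions import Fraction

def t5(x: int) -> int:
    """The 5x + 1 map: (5x + 1)/2 for odd x, x/2 for even x."""
    return (5 * x + 1) // 2 if x % 2 else x // 2

def pruned_preimages(a: int) -> list:
    """Preimages of a under the 5x + 1 map, keeping only those not divisible by 5."""
    pre = [2 * a]  # the even preimage always exists
    if (2 * a - 1) % 5 == 0 and ((2 * a - 1) // 5) % 5 != 0:
        pre.append((2 * a - 1) // 5)  # the odd preimage, when it is an integer prime to 5
    return pre

# Among the 20 residues mod 25 that are prime to 5, count those with two legal preimages.
types = [a for a in range(25) if a % 5 != 0]
two_child = [a for a in types if len(pruned_preimages(a)) == 2]
branch_fraction = Fraction(len(two_child), len(types))
```

Under this assumption the branching fraction is exactly 1/5, so the mean number of offspring per node is 6/5.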

Just as in the 3x + 1 branching random walk models, the behavior of the random walk part of the model can be completely reconstructed from knowing the type of each node.

For the rest of this section, let $\omega$ denote a single realization of such a branching random walk $\mathcal{B}[5^j]$ which starts from a single individual $\omega_{0,1}$ of type $1 \pmod{5^j}$ at depth 0, with initial position label $\log|a|$. Here $\omega$ describes a particular infinite tree. We let $N_k(\omega)$ denote the number of individuals at level $k$ of the tree. We let $S(\omega_{k,j})$ denote the position of the $j$-th individual at level $k$ in the tree, for $1 \le j \le N_k(\omega)$.

These models are supercritical branching processes exactly as for the 3x + 1 case: In every random realization $\omega$, the number of nodes at level $d$ grows exponentially in $d$, and there are no extinction events.



In terms of growth of trees of inverse iterates, these models will accurately represent certain features of 5x + 1 trees, and not others. They might accurately describe tree sizes. However these branching random walks very likely do not accurately model positions of inverse iterates of the 5x + 1 map in certain crucial ways. Namely, individuals whose branching walk position is negative (corresponding to a 5x + 1 iteration value $x$ falling in the interval $(0, 1)$) are where the correction term in (8.4) in the 5x + 1 iteration becomes significant, breaking the size connection of the model iterates and the 5x + 1 iterates.

We now give some quantities of the trees associated to a realization $\omega$ of the branching random walk $\mathcal{B}[5^j]$. We let $N_k := N_k(\omega)$ denote the number of individuals in generation $k$, and let $\{\omega_{k,i} : 1 \le i \le N_k(\omega)\}$ denote the set of all individuals in generation $k$, ordered by their branching random walk locations on the line, denoted

$$L(\omega_{k,1}) \le L(\omega_{k,2}) \le \cdots \le L(\omega_{k,N_k}).$$

The size of the element $\omega_{k,i}$, viewed as an analogue of the 5x + 1 iterates, is the exponentiated quantity

$$Z_{k,i} := e^{L(\omega_{k,i})}. \tag{8.12}$$

The branching random walk has the property that the sizes of most individuals in a tree will tend to get larger. (This initially seems rather surprising, but note that if a forward orbit is unbounded, then necessarily all backward orbits leading to it must be unbounded as well!) We are interested in individuals whose size under the 5x + 1 iteration is around a given value $x$. The tree models will detect individuals whose size is larger than $x$.

In the following subsections we address for the 5x + 1 branching random walk models the following questions.

1. What is the exponential growth rate of the quantities $N_k(\omega)$, as a function of $k$?

2. What is the maximum level $k$ that has some individual $Z_{k,i} \le x$? This requires analyzing the size of the first birth location $L(\omega_{k,1})$.

3. How does the total number of individuals $\pi_5(x;\omega)$ in the 5x + 1 tree having location $Z_{k,i} \le x$ grow as a function of $x$?



8.8. Backwards Iteration Prediction: 5x + 1 Tree Counts 

The size of 5x + 1 trees can be estimated for these models $\mathcal{B}[5^j]$, as follows.

Theorem 8.7 (5x + 1 Stochastic Tree Size) For all $j \ge 0$ a realization $\omega$ of a tree grown in the 5x + 1 branching random walk model $\mathcal{B}[5^j]$ satisfies

$$\lim_{k\to\infty} \frac{1}{k}\bigl(\log N_k(\omega)\bigr) = \log\left(\frac{6}{5}\right), \quad \text{almost surely.} \tag{8.13}$$

Proof. This is proved in exactly similar fashion to the 3x + 1 stochastic model case in Lagarias and Weiss [23, Corollary 3.1]. ■

This result only uses the Galton-Watson process branching structure built into the branching random walk $\mathcal{B}[5^j]$. It does not depend on the sizes of the iterates.
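
A direct simulation of the underlying Galton-Watson process illustrates this growth rate. The sketch below assumes each individual independently has two offspring with probability 1/5 and one offspring otherwise; the generation count and seed are arbitrary illustrative choices.

```python
import math
import random

def generation_sizes(generations, seed=0):
    """Sizes N_0, N_1, ..., N_generations of one simulated tree grown from a single root."""
    rng = random.Random(seed)
    sizes = [1]
    for _ in range(generations):
        n = sizes[-1]
        # Every individual has one offspring; with probability 1/5 it has a second one.
        extra = sum(1 for _ in range(n) if rng.random() < 0.2)
        sizes.append(n + extra)
    return sizes

sizes = generation_sizes(60, seed=1)
growth_rate = math.log(sizes[-1]) / 60
```

For a single tree the finite-depth estimate fluctuates around log(6/5), about 0.182, and the fluctuation shrinks like 1/k as the depth k grows.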



The conclusion of Theorem 8.7, viewed as a prediction of the growth behavior of 5x + 1 trees, is consistent with the rigorous results on average tree size for pruned 5x + 1 trees given in Theorem 7.5.



8.9. Backwards Iteration Prediction: Extremal Finite Total Stopping Times 

As indicated above, most integers for the 5x + 1 map will not have a finite total stopping time. However it is of interest to analyze the small subset of integers that do have a total stopping time; these are exactly the integers in the tree of inverse iterates of $a = 1$. We analyze what is the maximum generation $k$ that contains an individual having size $e^{L(\omega_{k,1})} \le x$.

Denote the location of this first birth individual in generation $k$ by $L_k^*(\omega) := L(\omega_{k,1})$, for a given realization $\omega$ of the random walk.

Figure 8.8: A plot of $a$ versus $g_{5,BP}(a)$, in the range $\log(2/5) < a < \frac{1}{6}\log(64/5)$.

Figure 8.9: A plot of $a$ versus $\theta^*$, in the range $\log(2/5) < a < \frac{1}{6}\log(64/5)$.



Theorem 8.8 (Asymptotic 5x + 1 First Birth Location) There is a constant $\beta_{5,BP}$ such that, for all $j \ge 1$, the branching random walk model $\mathcal{B}[5^j]$ has asymptotic first birth (leftmost birth)

$$\lim_{k\to\infty} \frac{1}{k} L_k^*(\omega) = \beta_{5,BP} \quad \text{a.s.} \tag{8.14}$$

This constant $\beta_{5,BP} \approx 0.01179816$ is determined uniquely by the properties that it is the unique constant with $\beta > 0$ that satisfies

$$g_{5,BP}(\beta) = 0, \tag{8.15}$$

where

$$g_{5,BP}(a) := -\sup_{\theta \le 0}\left(a\theta - \log\left(2^\theta + \frac{1}{5}\left(\frac{2}{5}\right)^\theta\right)\right). \tag{8.16}$$

Proof. This is proved by an argument analogous to the 3x + 1 case analyzed in Lagarias and Weiss [23, Theorem 3.4], cf. Theorem 6.3. Here we use a branching process (inverse) moment generating function

$$M_{5,BP}(\theta) := 2^\theta + \frac{1}{5}\left(\frac{2}{5}\right)^\theta \tag{8.17}$$

in computing the rate function $g_{5,BP}(a)$. We note that $g_{5,BP}(a)$ is increasing for $\log\frac{2}{5} < a < \frac{1}{6}\log\frac{64}{5}$ (see Figure 8.8), and on this range the value $\theta^* := \theta(a)$ achieving the extremum in (8.16) is an increasing function of $a$, reaching the value 0 at the upper endpoint (see Figure 8.9). We have $g_{5,BP}(a) = \log(\frac{6}{5})$ for $\frac{1}{6}\log\frac{64}{5} \le a < \infty$. ■
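
The first birth constant can be recomputed by solving $g_{5,BP}(\beta) = 0$ numerically. The sketch below is an illustration under stated assumptions: the inner extremum is found by ternary search over $\theta \in [-40, 0]$, the outer root by bisection on the bracket [0, 0.2], and all function names are hypothetical.

```python
import math

def log_mgf_bp(theta):
    """log of the branching moment generating function 2^theta + (1/5)(2/5)^theta."""
    return math.log(2.0 ** theta + 0.2 * (2.0 / 5.0) ** theta)

def g_bp(a, theta_min=-40.0, iters=200):
    """g(a) = -sup_{theta<=0}(a*theta - log M) = min_{theta<=0}(log M - a*theta),
    by ternary search on a convex objective (assumes a > log(2/5))."""
    lo, hi = theta_min, 0.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_mgf_bp(m1) - a * m1 > log_mgf_bp(m2) - a * m2:
            lo = m1
        else:
            hi = m2
    t = 0.5 * (lo + hi)
    return log_mgf_bp(t) - a * t

def first_birth_speed(lo=0.0, hi=0.2, tol=1e-12):
    """Bisection for the root of g_bp, assuming g_bp(lo) < 0 < g_bp(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g_bp(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta = first_birth_speed()
gamma = 1.0 / beta  # reciprocal of the first birth speed
```

The root is roughly 0.0118 and its reciprocal roughly 84.76, so even the leftmost individuals drift slowly to the right.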



Now one defines a branching random walk stopping limit

$$\gamma_{5,BP}(\omega) := \limsup_{k\to\infty} \frac{k}{L_k^*(\omega)}.$$

Theorem 8.8 implies that this value is constant almost surely, equaling a value $\gamma_{5,BP}$ given by

$$\gamma_{5,BP} = \frac{1}{\beta_{5,BP}} \approx 84.76012. \tag{8.18}$$

One can show the constants $\gamma_{5,BP}$ and $\gamma_{5,RRW}$ agree, just as for the 3x + 1 stochastic models.

Theorem 8.9 (5x + 1 Random Walk-Branching Random Walk Duality) The 5x + 1 repeated random walk (RRW) scaled stopping time limit $\gamma_{5,RRW}$ and the branching random walk stopping limit $\gamma_{5,BP}$ for the 5x + 1 branching random walk (BP) model $\mathcal{B}[5^j]$ with $j = 0$, are related by

$$\gamma_{5,RRW} = \gamma_{5,BP}.$$

Proof. This result is proved using a relation between moment generating functions

$$M_{5,BP}(\theta) = M_{5,RRW}(\theta + 1), \tag{8.19}$$

compare (8.11) and (8.16). It is identical in spirit to the proof in Lagarias and Weiss [23, Theorem 4.1]. ■
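
The relation between the two moment generating functions is an algebraic identity and can be spot-checked in a few lines. This sketch takes the two functions in the forms $\frac{1}{2}(2^\theta + (2/5)^\theta)$ and $2^\theta + \frac{1}{5}(2/5)^\theta$; the sample points are arbitrary.

```python
import math

def mgf_rrw(theta):
    """Random walk moment generating function (1/2)(2^theta + (2/5)^theta)."""
    return 0.5 * (2.0 ** theta + (2.0 / 5.0) ** theta)

def mgf_bp(theta):
    """Branching process moment generating function 2^theta + (1/5)(2/5)^theta."""
    return 2.0 ** theta + (1.0 / 5.0) * (2.0 / 5.0) ** theta

# Largest deviation from M_BP(theta) = M_RRW(theta + 1) over the sample points.
thetas = [-2.0, -1.0, -0.65, 0.0, 0.5, 1.5]
max_gap = max(abs(mgf_bp(t) - mgf_rrw(t + 1.0)) for t in thetas)
```

The shift by 1 reflects weighting each walk by its starting size: multiplying $\frac{1}{2}(2^\theta + (2/5)^\theta)$ termwise by the extra factors 2 and 2/5 gives $2^\theta + \frac{1}{5}(2/5)^\theta$.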



The analogue of this result applied to the 5x + 1 problem would be the following heuristic prediction: For any constant $\gamma > \gamma_{5,BP}$, all but finitely many trajectories having total stopping time $\sigma_\infty(n) > \gamma\log n$ necessarily have $\sigma_\infty(n) = +\infty$. We could take $\gamma = 85$, for example.



8.10. Backwards Iteration Prediction: Total Preimage Counts 

The following result gives, for the simplest branching random walk model $\mathcal{B}[5^0]$, an almost sure asymptotic for the number of inverse iterates of size below a given bound.

Theorem 8.10 (Stochastic Inverse Iterate Counts) For a realization $\omega$ of the branching random walk $\mathcal{B}[5^0]$, let $I^*(x;\omega)$ count the number of progeny located at positions $Z(\omega_{k,j}) \le x$, i.e.

$$I^*(x;\omega) := \#\{\omega_{k,j} : Z(\omega_{k,j}) \le x, \text{ for any } k \ge 1,\ 1 \le j \le N_k(\omega)\}. \tag{8.20}$$

This quantity satisfies with probability one the asymptotic estimate

$$I^*(x;\omega) = x^{\eta_{5,BP}+o(1)} \quad \text{as } x \to \infty, \tag{8.21}$$

in which $\eta_{5,BP} \approx 0.650919$ is the maximum value of $f(a) := \frac{1}{a}\,g_{5,BP}(a)$ taken over the interval $0 < a < \frac{1}{6}\log\frac{64}{5}$.

Proof. This is proved by a large deviations argument similar to that used in Lagarias and Weiss [23, Theorem 4.2]. One counts the number of progeny at level $k$ for each level $k$ satisfying the bound, by estimating the probability that a random leaf satisfies the appropriate large deviations bound. One shows that this number peaks for $k \approx \frac{1}{a^*}\log x$, where $\frac{1}{a^*} \approx 9.19963$ and $a^* \approx 0.1087$ is the value of $a$ achieving the maximum above. One shows that the right side is an upper bound for all levels $k$, and that the sum total of levels $k > 100\log x$ contribute negligibly to the sum. ■
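
The exponent and the maximizing slope can be computed without a grid search. The sketch below relies on a standard Legendre-transform argument that is an assumption of this example, not spelled out in the text: if $\theta_0 < 0$ satisfies $M_{5,BP}(\theta_0) = 1$, then $g_{5,BP}(a)/a \le -\theta_0$ with equality at $a^* = M_{5,BP}'(\theta_0)$. The bracket [-0.9, -0.1] is chosen to exclude the other root $\theta = -1$.

```python
import math

def mgf_bp(theta):
    """Branching moment generating function 2^theta + (1/5)(2/5)^theta."""
    return 2.0 ** theta + 0.2 * (2.0 / 5.0) ** theta

def mgf_bp_derivative(theta):
    """Derivative of mgf_bp with respect to theta."""
    return (math.log(2.0) * 2.0 ** theta
            + 0.2 * math.log(2.0 / 5.0) * (2.0 / 5.0) ** theta)

# Bisection for the root of M = 1 in (-0.9, -0.1), where M(-0.9) < 1 < M(-0.1).
lo, hi = -0.9, -0.1
while hi - lo > 1e-13:
    mid = 0.5 * (lo + hi)
    if mgf_bp(mid) < 1.0:
        lo = mid
    else:
        hi = mid
theta0 = 0.5 * (lo + hi)

eta = -theta0                                            # predicted growth exponent
a_star = mgf_bp_derivative(theta0) / mgf_bp(theta0)      # slope attaining the maximum
```

Under this assumption the exponent comes out near 0.650919 and the slope near 0.1087, in line with the constants quoted for Theorem 8.10.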



The model statistic $I^*(x;\omega)$ functions as a proxy for the 5x + 1 count function $\pi_a^*(x)$, where $\log|a|$ gives the position of the root node of the branching random walk. This result is the stochastic analogue of Conjecture 2.1 about the 3x + 1 growth exponent. The argument above also makes the prediction that the levels $k$ at which the bulk of the members of $\pi_a^*(x)$ occur have $k \approx \frac{1}{a^*}\log x$.



Remark. An entirely different set of branching random walk models has been developed by S. Volkov [40] to model the 5x + 1 problem. Volkov models counting all non-divergent trajectories of the 5x + 1 problem, which are those which enter some finite cycle, and denotes the number of these below $x$ by $Q(x)$. Thus $\pi_a^*(x) \le Q(x)$, and conjecturally these should be of similar orders of growth. It is expected there are finitely many cycles, and each should absorb roughly the same number of integers below $x$, in the sense of the exponent in the power of $x$ involved.

Volkov's branching process stochastic models grow a complete binary tree, rather than a tree that may have either one or two branches from each node, as in the models above. He suggests that the 5x + 1 problem can be modeled by such trees, using an unusual encoding of the iterates (some edges encode several iteration steps of the inverse Collatz function). In order to do this, his node weights are chosen differently than above. He arrives at a predicted exponent of approximately 0.678, which differs from the prediction $\eta_{5,BP} \approx 0.650919$ made in Theorem 8.10 above. The empirical data Volkov presents seems insufficient to discriminate between these two predicted exponents. It would be interesting for this problem to be investigated further.



9. Benford's Law for 3x + 1 and 5x + 1 Maps 

Another curious statistic satisfied by the 3x + 1 function was discovered by Kontorovich and Miller [17]: Benford's Law.

In the late 1800s, Newcomb [29] noticed a surprising fact while perusing tables of logarithms: certain pages were significantly more worn than others. Numbers whose leading digit was 1 were being looked up more frequently than numbers with other leading digits. Instead of observing one-ninth (about 11%) of entries having a leading digit of 1, as one would expect if the digits 1, 2, ..., 9 were equally likely, over 30% of the entries had leading digit 1, and about 70% had leading digit less than 5. Since $\log_{10} 2 \approx 0.301$ and $\log_{10} 5 \approx 0.699$, Newcomb speculated that the probability of observing a leading digit less than $k$ was $\log_{10} k$. This logarithmic phenomenon became known as Benford's Law after Benford [6] collected and in 1938 popularized extensive empirical evidence of this distribution in diverse data sets.
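
Newcomb's logarithmic rule translates directly into per-digit probabilities. The following minimal sketch tabulates them; the function names are illustrative.

```python
import math

def benford_leading_digit_prob(d: int) -> float:
    """P(leading digit = d) = log10(1 + 1/d) for d = 1, ..., 9."""
    return math.log10(1.0 + 1.0 / d)

def benford_less_than(k: int) -> float:
    """P(leading digit < k) = log10(k) for k = 1, ..., 10."""
    return math.log10(k)

# Table of the nine leading-digit probabilities.
probs = {d: benford_leading_digit_prob(d) for d in range(1, 10)}
```

The digit 1 receives probability about 0.301, and the digits 1 through 4 together receive about 0.699, matching the two percentages quoted above.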

Benford's law seems to hold for many sequences of numbers generated by dynamical systems having an "expanding" property, see Berger et al. [7] and Miller and Takloo-Bighash [28, Chap. 9]. Benford behavior has been empirically observed for initial digits of the first iterates of the 3x + 1 map or accelerated 3x + 1 map for a randomly chosen initial number $n$. Here we survey some rigorous theorems quantifying this statement, for initial iterates. Similar Benford results can be proved for the 5x + 1 function.

We emphasize that the Benford law behavior quantified here concerns behavior on a fixed finite set of initial iterates of these maps. Indeed, the 3x + 1 conjecture predicts that Benford behavior cannot hold for the full infinite set of forward iterates, since conjecturally they become periodic! However it remains possible that a strong form of Benford behavior could hold on (infinite) divergent orbits of the 5x + 1 problem.



9.1. Benford's Law and Uniform Distribution of Logarithms 

To make Benford's law precise, we say that the mantissa function $M(n) \in [1, 10)$ is the leading entry of $n$ in "scientific notation", that is, $n = M(n)\cdot 10^{\lfloor \log_{10} n\rfloor}$. Benford's law concerns the distribution of the leading digit of the mantissa, while one can also consider the distribution of the lower order digits of the mantissa.

Definition 9.1 An infinite sequence $\{n_1, n_2, \ldots, n_k, \ldots\}$ satisfies the strong Benford's Law (to base 10) if the logarithmic digit frequency holds for any order digits in the mantissa. That is, for any $a \in [1, 10)$,

$$\lim_{x\to\infty} \frac{\#\{k \le x : M(n_k) < a\}}{x} = \log_{10}(a). \tag{9.1}$$

The strong version of Benford's law is well known to be equivalent to uniform distribution mod 1 of the base 10 logarithms of the numbers in the sequence, cf. Diaconis [15, Theorem 1].

Theorem 9.1 (Strong Benford Law Criterion) A sequence $\{n_1, n_2, \ldots\}$ satisfies the strong Benford's Law (or "is strong Benford") to base 10 if and only if the sequence $\{\log_{10} n_1, \log_{10} n_2, \ldots\}$ is equidistributed (mod 1), that is, for any $a \in [0, 1)$,

$$\lim_{x\to\infty} \frac{\#\{k \le x : \log_{10} n_k \ (\mathrm{mod}\ 1) < a\}}{x} = a. \tag{9.2}$$

The definition and theorem above extend to expansions in any integer base $B \ge 2$. This result suggests the following general definition of strong Benford's Law to any real base $B > 1$.

Definition 9.2 Let $B > 1$ be a real number. A sequence $\{n_1, n_2, \ldots, n_k, \ldots\}$ satisfies the strong Benford's Law to base $B$ if and only if the sequence $\{\log_B(n_1), \log_B(n_2), \ldots\}$ is uniformly distributed modulo one.

This definition is equivalent to the earlier one for integers expanded in a radix expansion to any base $B > 1$. One can similarly define the mantissa function to any real base $B > 1$, extending Definition 9.1.
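
The mantissa function and the equidistribution criterion can be tried out on a simple test sequence. The example below uses the powers of 2, which are not discussed in the text and serve purely as an illustration; the function names and the cutoff of 1000 terms are likewise assumptions of this sketch.

```python
import math

def mantissa(n: int, base: float = 10.0) -> float:
    """M(n) in [1, base): n divided by base**floor(log_base n)."""
    log_n = math.log(n) / math.log(base)
    return base ** (log_n - math.floor(log_n))

def fraction_below(seq, a, base=10.0):
    """Empirical frequency of the event {log_base(n) mod 1 < a} over a finite sequence."""
    hits = sum(1 for n in seq if (math.log(n) / math.log(base)) % 1.0 < a)
    return hits / len(seq)

# Fraction of the first 1000 powers of 2 whose leading decimal digit is 1,
# i.e. whose base-10 logarithm mod 1 lies below log10(2).
powers_of_two = [2 ** k for k in range(1, 1001)]
freq = fraction_below(powers_of_two, math.log10(2.0))
```

Floating-point rounding can misclassify values sitting exactly on a boundary, which is harmless for frequency estimates like this one but would matter for exact digit extraction.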



Benford's Law is stated for infinite sequences. However one can obtain approximate results that apply to finite sequences $\{x_1, x_2, \ldots, x_k\}$, by using the following discrepancy measure of approximation to uniform distribution of such sequences.

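Definition 9.3 Given a finite set $\mathcal{Y} = \{y_1, \ldots, y_k\}$ of size $k$, for each $0 \le a \le 1$, set

$$\mathcal{D}(\mathcal{Y}; a) := \frac{\#\{j \le k : y_j \ (\mathrm{mod}\ 1) < a\}}{k} - a.$$

The discrepancy $\mathcal{D}(\mathcal{Y})$ is defined by

$$\mathcal{D}(\mathcal{Y}) := \sup_{0 \le a \le 1} \mathcal{D}(\mathcal{Y}; a) - \inf_{0 \le a \le 1} \mathcal{D}(\mathcal{Y}; a).$$

One always has $\mathcal{D}(\mathcal{Y}) \le 1$. The smallest possible discrepancy of a finite set $\mathcal{Y}$ is $\mathcal{D}(\mathcal{Y}) = 1/k$, attained by equally spaced elements $y_j = \frac{j}{k}$, $1 \le j \le k$.

A small discrepancy indicates that the set $\mathcal{Y}$ is close to equidistributed modulo 1. In particular, for an infinite sequence $X = \{x_j : j \ge 1\}$, if $X_k = \{x_j : 1 \le j \le k\}$ then $X$ is uniformly distributed (mod 1) if and only if the discrepancies $\mathcal{D}(X_k) \to 0$ as $k \to \infty$.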
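
The discrepancy of a finite set can be computed exactly from its sorted fractional parts, since the function $a \mapsto \#\{j : y_j \bmod 1 < a\}/k - a$ is piecewise linear with jumps only at the points themselves. The sketch below is one way to do this; the function name is an illustrative choice.

```python
def discrepancy(values):
    """Discrepancy sup_a D(Y; a) - inf_a D(Y; a) over 0 <= a <= 1, where
    D(Y; a) = #{j : y_j mod 1 < a}/k - a."""
    k = len(values)
    s = sorted(v % 1.0 for v in values)
    # Just to the right of s[i], at least i + 1 points are counted (supremum candidates);
    # at a = s[i] itself, at most i points are counted (infimum candidates).
    sup_part = max(max((i + 1) / k - s[i] for i in range(k)), 0.0)
    inf_part = min(min(i / k - s[i] for i in range(k)), 0.0)
    return sup_part - inf_part

equally_spaced = [j / 8 for j in range(1, 9)]
d_equal = discrepancy(equally_spaced)   # the minimal value 1/k for k = 8
d_clustered = discrepancy([0.5] * 8)    # all mass at one point: the maximal value 1
```

The two test sets realize the extreme values 1/k and 1 mentioned in the definition.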



9.2. Benford's Law for 3x + 1 Function Iterates

Kontorovich and Miller [17] considered iterates of the accelerated 3x + 1 function $U(n)$. Fix an odd integer $n = n_0$, and let $\{n_1, n_2, \ldots\}$ be the sequence of iterates from the starting seed $n_0 \in \Pi$, where $\Pi$ consists of all positive integers relatively prime to 6. The main 3x + 1 conjecture asserts that this sequence is eventually periodic, and hence it is impossible for (9.2) to hold!

The following was their interpretation of (weak) "Benford behavior" for the 3x + 1 function:

Theorem 9.2 For $x_0 = n \in \Pi$, denote its accelerated 3x + 1 iterates by $x_k := U^{(k)}(x_0)$. Now set $y_k := \log_{10} x_k$ and define the shifted variables

$$\omega_k := y_k - y_0 - k\log_{10}\left(\tfrac{3}{4}\right).$$

Then, for any $a \in [0, 1)$,

$$\lim_{k\to\infty} \mathbb{D}_\Pi\bigl[x_0 : \omega_k \ (\mathrm{mod}\ 1) < a\bigr] = a.$$

Proof. This is established as Theorem 5.3 in Kontorovich and Miller [17]. ■

Arguably, the normalization from $y_k$ to $\omega_k$ in Theorem 9.2 makes the above result only an approximation to "true" Benford behavior, which should be that $\mathbb{D}_\Pi[x_0 : y_k \ (\mathrm{mod}\ 1) < a] \to a$ as $k \to \infty$.

Lagarias and Soundararajan [22] were able to use the non-accelerated 3x + 1 function $T$ to show another approximation to Benford behavior, as follows.

Theorem 9.3 (Approximate Strong Benford's Law for 3x + 1 Map) Let $B > 1$ be any integer base. Then for a given $N \ge 1$ and each $X \ge 2^N$, most initial starting values $x_0$ in $1 \le x_0 \le X$ have first $N$ initial 3x + 1 iterates $\{x_k : 1 \le k \le N\}$ that satisfy the discrepancy bound

$$\mathcal{D}\bigl(\{\log_B x_k \ (\mathrm{mod}\ 1) : 1 \le k \le N\}\bigr) \le 2N^{-1/36}. \tag{9.3}$$

The exceptional set $\mathcal{E}(X, B)$ of initial seeds $x_0$ in $1 \le x_0 \le X$ that do not satisfy the bound has cardinality

$$|\mathcal{E}(X, B)| \le c(B)\,N^{-1/36}\,X, \tag{9.4}$$

where $c(B)$ is a positive constant depending only on the base $B$.

Proof. This is established as Theorem 2.1 in Lagarias and Soundararajan [22]. ■



9.3. Benford's Law for 5x + 1 Function Iterates 

The 5x + 1 map also exhibits similar "Benford" behavior for its iterates. The results of [17] apply to general $(d, g, h)$-maps, in particular to the 5x + 1 function, giving a direct analogue of Theorem 9.2.

The method of proof in [22] of Theorem 9.3 should also extend to give qualitatively similar results in the 5x + 1 case. This proof relied on the Parity Sequence Theorem for the 3x + 1 map, which has an exact analogue for the 5x + 1 map. The proof in [22] also used some Diophantine approximation results for the transcendental number $\theta_3 := \log_2 3$, and qualitatively similar Diophantine approximation results are valid for $\theta_5 := \log_2 5$ needed in the 5x + 1 case.



These rigorous results concern only the initial iterates of 5x + 1 trajectories. However since 
the 5x + 1 map conjecturally has divergent orbits, it seems a plausible guess that a strong form 
of Benford behavior might hold on all infinite divergent orbits of the 5x + 1 map. 



10. 2-Adic Extensions of 3x + 1 and 5s + 1 Maps 

What happens if we put these probabilistic models in a more general context? We can obtain a 
perfect set of symbolic dynamics if we extend the domain of these maps to the 2-adic integers. 
Such extensions are possible for both the 3x + 1 map ^^(x) and the 5x + 1 map T^{x). 

Theorem 10.1 The 3a; + 1 map T3 and the 5a; + 1 map T^ extend continuously from maps on 
the integers to maps on the 2-adic integers Z2, viewing X as a dense subset 0/Z2. Denoting the 
extensions by T3 and T5, respectively, these maps have the following properties, 
(i) Both maps T3 and T^are homeomorphisms 0/Z2 to itself. 

(a) Both maps T3 and T5 are measure-preserving maps on Z2 for the standard 2-adic measure 
IJL2 on Z2. 

(Hi) Both maps and T^ are strongly mixing with respect to the measure 112, hence ergodic. 
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Proof. For the 3a; + 1 map, properties (i)-(iii) are stated in Lagarias |2H Theorem K]. The 
property of strong mixing is an ergodic-theoretic notion explained there. Akin [T] gives another 
proof of these facts for the 3x + 1 map. 

For the 5x + 1 map, properties (i)-(iii) may be established by proofs similar to those for the 
3x + 1 map. This is based on the fact that an analogue of Theorem 2.1 holds for the symbolic 
dynamics of iterating the 5x + 1 map. It is also a corollary of results of Bernstein and Lagarias 
[9, Sect. 4], whose results imply that (i)-(iii) hold more generally for all ax + b maps. Here the 
ax + b map $T_{a,b}$ is 



$$
T_{a,b}(x) := \begin{cases} \dfrac{ax + b}{2} & \text{if } x \equiv 1 \pmod{2}, \\[4pt] \dfrac{x}{2} & \text{if } x \equiv 0 \pmod{2}, \end{cases}
$$



where a and b are odd integers. 
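
As a concrete illustration of this definition, the following minimal Python sketch iterates the ax + b map on ordinary integers, which form a dense subset of the 2-adic integers. The function names `t_ab` and `orbit` are illustrative choices and are not notation from the text.

```python
def t_ab(x: int, a: int = 3, b: int = 1) -> int:
    """One step of the ax + b map on an ordinary integer x (a and b odd)."""
    if a % 2 == 0 or b % 2 == 0:
        raise ValueError("a and b must both be odd")
    # For odd x, a*x + b is even, so the floor division by 2 is exact.
    return (a * x + b) // 2 if x % 2 else x // 2


def orbit(x: int, steps: int, a: int = 3, b: int = 1) -> list:
    """Return [x, T(x), ..., T^steps(x)] for the ax + b map."""
    out = [x]
    for _ in range(steps):
        x = t_ab(x, a, b)
        out.append(x)
    return out


if __name__ == "__main__":
    print(orbit(7, 5))        # first iterates under the 3x + 1 map
    print(orbit(7, 5, a=5))   # first iterates under the 5x + 1 map
```

Taking a = 3, b = 1 gives the 3x + 1 map and a = 5, b = 1 gives the 5x + 1 map; negative starting values are handled as well, since Python's `%` and `//` round toward minus infinity.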



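A much stronger ergodicity result is valid for the 2-adic extensions of these maps. Define 
the 2-adic shift map $S : \mathbb{Z}_2 \to \mathbb{Z}_2$ to be the 2-to-1 map given for 
$\alpha = \sum_{j=0}^{\infty} a_j 2^j = .a_0 a_1 a_2 \ldots$, with each $a_j = 0$ or $1$, by 

$$
S(\alpha) = S(.a_0 a_1 a_2 \cdots) := .a_1 a_2 a_3 \cdots
$$

That is, 

$$
S(\alpha) = \begin{cases} \dfrac{\alpha - 1}{2} & \text{if } \alpha \equiv 1 \pmod{2}, \\[4pt] \dfrac{\alpha}{2} & \text{if } \alpha \equiv 0 \pmod{2}. \end{cases} \tag{10.1}
$$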

This map preserves the 2-adic (Haar) measure $\mu_2$, and is mixing in the strongest sense. 
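
The two descriptions of the shift map (dropping the first binary digit, and the case formula (10.1)) can be compared on ordinary integers, whose 2-adic digits are their two's-complement bits. The sketch below is only an illustration; the names `shift` and `digits` are hypothetical helpers.

```python
def shift(alpha: int) -> int:
    """2-adic shift S restricted to ordinary integers, following (10.1)."""
    return (alpha - alpha % 2) // 2


def digits(alpha: int, n: int) -> list:
    """First n binary digits a_0, ..., a_{n-1} of alpha viewed as a 2-adic integer.

    Python's >> on negative ints is an arithmetic shift, so the bits produced
    are the two's-complement digits; for example -1 has every digit equal to 1.
    """
    return [(alpha >> k) & 1 for k in range(n)]


if __name__ == "__main__":
    for alpha in (6, 7, -1, -6):
        # Digits of S(alpha) are the digits of alpha with a_0 removed.
        print(alpha, digits(alpha, 6), "->", shift(alpha), digits(shift(alpha), 5))
```

Both odd and even inputs lose exactly their lowest digit, which is why every point has two preimages under S.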

Theorem 10.2 The 2-adic extensions $T_3$ of the 3x + 1 map and $T_5$ of the 5x + 1 map are 
each topologically conjugate to the 2-adic shift map, by a conjugacy map $\Phi_3$, resp. $\Phi_5$. That is, 
these maps are homeomorphisms of $\mathbb{Z}_2$ with $\Phi_3^{-1} \circ T_3 \circ \Phi_3 = S$ and $\Phi_5^{-1} \circ T_5 \circ \Phi_5 = S$. 

(1) The maps $\Phi_j$, $j = 3$ or $5$, are solenoidal, i.e. for each $n \ge 1$ they have the property 

$$
x \equiv y \pmod{2^n} \implies \Phi_j(x) \equiv \Phi_j(y) \pmod{2^n}.
$$

(2) The inverses of these conjugacy maps are explicitly given by 

$$
\Phi_j^{-1}(\alpha) = \sum_{k=0}^{\infty} \left( T_j^{k}(\alpha) \ (\mathrm{mod}\ 2) \right) 2^k
$$

for $j = 3$ or $5$, where the residue (mod 2) is taken to be 0 or 1. 

Proof. These results follow from Bernstein and Lagarias [9, Sect. 3, 4], where results are 
proved for a general class of mappings including both the 3x + 1 map and 5x + 1 map. ■ 
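
The explicit formula in part (2) can be explored numerically by truncating the sum after n terms. The sketch below is a finite check on ordinary integers and makes no claim to reproduce the cited proofs. The names `t_map` and `parity_map` are illustrative assumptions. It checks that the truncated inverse conjugacy depends only on the input modulo 2^n, permutes the residues modulo 2^n, and intertwines the map with the shift.

```python
def t_map(x: int, a: int) -> int:
    """One step of the ax + 1 map (a odd) on an ordinary integer."""
    return (a * x + 1) // 2 if x % 2 else x // 2


def parity_map(x: int, n: int, a: int = 3) -> int:
    """Truncated inverse conjugacy: sum over k < n of (T^k(x) mod 2) * 2^k."""
    total = 0
    for k in range(n):
        total |= (x & 1) << k   # record the parity of the k-th iterate
        x = t_map(x, a)
    return total


def check(n: int, a: int) -> bool:
    """Finite consistency checks modulo 2^n for the ax + 1 map."""
    modulus = 2 ** n
    # The truncated map permutes the residues 0, ..., 2^n - 1.
    is_permutation = sorted(parity_map(x, n, a) for x in range(modulus)) == list(range(modulus))
    # It depends only on x mod 2^n, and satisfies Q(T(x)) = S(Q(x)) after truncation.
    compatible = all(
        parity_map(x + modulus, n, a) == parity_map(x, n, a)
        and parity_map(t_map(x, a), n - 1, a) == parity_map(x, n, a) >> 1
        for x in range(-modulus, modulus)
    )
    return is_permutation and compatible


if __name__ == "__main__":
    print(check(8, 3), check(8, 5))
```

The same code path handles a = 3 and a = 5, which mirrors the point of the theorem: at the level of parity sequences the two maps are indistinguishable.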



Theorem 10.2 immediately gives the following corollary. 
Corollary 10.1 The 2-adic extensions of the 3x + 1 map and of the 5x + 1 map are 
topologically conjugate and metrically conjugate maps. 

The corollary shows that from the viewpoint of extensions to the 2-adic integers, the 3x + 1 
map and the 5x + 1 map have identical ergodic theory properties, i.e. they are both conjugate 
to the shift map. That is, their symbolic dynamics is "the same" in the topological sense, and 
their dynamics is also identical in the measure-theoretic sense. 

The original 3x + 1 problem (resp. 5x + 1 problem) concerns the behavior of these maps when 
restricted to the dense set $\mathbb{Z}$ inside $\mathbb{Z}_2$. This set $\mathbb{Z}$ is countable, so it has 2-adic measure zero, 
and the general properties of ergodic theory therefore allow no conclusion to be drawn about the 
behavior of iteration of these maps on $\mathbb{Z}$. Indeed, empirical data and the stochastic models above 
show that the dynamics of iteration of the 3x + 1 map and 5x + 1 map are "not the same" on $\mathbb{Z}$. 

To conclude, we remark that the two accelerated functions $U_3$ and $U_5$ also make sense 
2-adically, on a restricted domain. Let $\mathbb{Z}_2^{\times} = \{\alpha \in \mathbb{Z}_2 : \alpha \equiv 1 \pmod{2}\}$. We have 
$U_3 : \mathbb{Z}_2^{\times} \to \mathbb{Z}_2^{\times} \cup \{0\}$ (in the latter case we set $U_3(-1/3) = 0$) and 
$U_5 : \mathbb{Z}_2^{\times} \to \mathbb{Z}_2^{\times} \cup \{0\}$ (in the latter case we set $U_5(-1/5) = 0$). It might prove 
worthwhile to find invariant measures for these functions, and 
to study their ergodic-theoretic behavior. 
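
For orientation, here is a minimal sketch of the accelerated maps on ordinary odd integers. It assumes the usual definition, in which the accelerated map sends an odd n to an + 1 with every factor of 2 divided out. The function name `accelerated` is a hypothetical choice. For a = 3 or 5 and an ordinary odd integer n, the quantity an + 1 is never zero, so the exceptional value 0 does not occur in this integer-only sketch.

```python
def accelerated(n: int, a: int = 3) -> int:
    """Accelerated ax + 1 map on an odd integer n: remove all factors of 2 from a*n + 1."""
    if n % 2 == 0:
        raise ValueError("the accelerated map is defined on odd inputs only")
    m = a * n + 1
    if m == 0:
        # Cannot happen for a = 3 or 5 with integer n; kept to mirror the
        # convention that the exceptional input is sent to 0.
        return 0
    while m % 2 == 0:
        m //= 2
    return m


if __name__ == "__main__":
    print([accelerated(n) for n in (1, 3, 5, 7, 9)])        # 3x + 1 case
    print([accelerated(n, a=5) for n in (1, 3, 5, 7, 9)])   # 5x + 1 case
```

Every output is again odd, so the map can be iterated on the odd integers without leaving its domain.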

11. Concluding Remarks 

We have presented results on stochastic models simulating aspects of the behavior of the 
3x + 1 and 5x + 1 functions. These models resulted in specific predictions about various 
statistics of the orbits of these functions under iteration, which can be tested empirically. The 
experimental tests done so far have generally been consistent with these predictions. 

11.1. Comparisons 

We compare and contrast the behavior of these two maps under iteration. The 3x + 1 map and 
5x + 1 map are similar in the following respects. 

1. (Symbolic dynamics) The allowed symbolic dynamics of even and odd iterates is the same 
for the 3x + 1 and 5x + 1 maps. Every finite symbol sequence is legal. 

2. (Periodic orbits on the integers) Conjecturally, both the 3x + 1 map and the 5x + 1 map have 
a finite number of distinct periodic orbits on the domain $\mathbb{Z}$. 

3. (Periodic orbits on rational numbers with odd denominator) Every possible symbolic 
dynamics for a periodic orbit is the periodic orbit for some rational starting point, for both 
the 3x + 1 map and 5x + 1 map. That is, extensions of the maps $T_3$ and $T_5$ to rational 
numbers with odd denominator each have $2^p$ periodic orbits of period $p$, for each $p \ge 1$. 
Here the period $p$ may not be the minimal period of the orbit, so a period $k$ orbit is also 
counted as a period $p = kn$ orbit for each $n \ge 1$. 
4. (Benford law behavior) Both the initial 3x + 1 function iterates of a random starting 
point, and the initial 5x + 1 iterates of a random starting point, with high probability 
exhibit strong Benford law behavior to any integer base $B \ge 2$. 

5. (2-adic extensions) The 2-adic extensions of the two maps are topologically and metrically 
conjugate. Therefore they have the same dynamics in the topological sense, and in the 
ergodic theory sense, on the domain $\mathbb{Z}_2$. 
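
The statement in item 3 about rational periodic orbits can be made concrete. Composing the two affine branches along a prescribed parity word of length p gives an affine map with slope a^m / 2^p, where m is the number of odd steps, and its unique fixed point is a rational number with odd denominator. The sketch below computes that fixed point with exact arithmetic and confirms that its orbit realizes the prescribed word. The helper names `step`, `periodic_point` and `parity_word` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def step(x: Fraction, a: int) -> Fraction:
    """One step of the ax + 1 map on a rational with odd denominator."""
    if x.denominator % 2 == 0:
        raise ValueError("input must have an odd denominator")
    # With an odd denominator, the parity of x is the parity of its numerator.
    return (a * x + 1) / 2 if x.numerator % 2 else x / 2


def periodic_point(word, a: int = 3) -> Fraction:
    """Fixed point of the affine map obtained by following the parity word (1 = odd step)."""
    if len(word) == 0:
        raise ValueError("the parity word must be non-empty")
    mult, const = Fraction(1), Fraction(0)   # composite map is x -> mult * x + const
    for bit in word:
        if bit:
            mult, const = a * mult / 2, (a * const + 1) / 2
        else:
            mult, const = mult / 2, const / 2
    # mult = a^m / 2^p is never 1, because a^m is odd and 2^p is even.
    return const / (1 - mult)


def parity_word(x: Fraction, p: int, a: int):
    """Return the first p parities of the orbit of x, together with the p-th iterate."""
    bits = []
    for _ in range(p):
        bits.append(x.numerator % 2)
        x = step(x, a)
    return tuple(bits), x


if __name__ == "__main__":
    for a in (3, 5):
        for word in product((0, 1), repeat=3):
            x = periodic_point(word, a)
            bits, back = parity_word(x, 3, a)
            print(a, word, x, bits == word and back == x)
```

For each of the 2^p words of length p the computed starting point is distinct, since its orbit has a different parity sequence; a word that repeats a shorter block yields an orbit of smaller minimal period, matching the counting convention in item 3.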

The main differences between the 3x + 1 map and the 5x + 1 map concern the change in size 
of their iterates. 

1. (Short-term behavior of iterates) For the 3x + 1 map, the initial steps of most orbits shrink 
in size, while for the 5x + 1 map most orbits expand in size. This is rigorously quantified 
in ^and ^ 

2. (Long-term behavior of iterates) The 3x + 1 and 5x + 1 maps conjecturally differ greatly in 
the long-term behavior of their orbits on the integers. For the 3x + 1 map, conjecturally all 
orbits are bounded. For the 5x + 1 map, conjecturally a density one set of integers have 
unbounded orbits. 
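
The short-term contrast in item 1 is easy to observe by brute force. The sketch below is an empirical count only, not the rigorous quantification referred to above. It applies k steps of each map to every starting value in a range covering each residue class modulo 2^k equally often, and reports the fraction of starting values whose k-th iterate is smaller than the start. The function names and the default parameters k = 12 and 16 blocks are arbitrary illustrative choices.

```python
def t_map(x: int, a: int) -> int:
    """One step of the ax + 1 map (a odd) on a positive integer."""
    return (a * x + 1) // 2 if x % 2 else x // 2


def shrink_fraction(a: int, k: int = 12, blocks: int = 16) -> float:
    """Fraction of n in [1, blocks * 2^k] whose k-th iterate is below n."""
    limit = blocks * 2 ** k
    shrunk = 0
    for n in range(1, limit + 1):
        x = n
        for _ in range(k):
            x = t_map(x, a)
        if x < n:
            shrunk += 1
    return shrunk / limit


if __name__ == "__main__":
    print("3x + 1:", shrink_fraction(3))
    print("5x + 1:", shrink_fraction(5))
```

After k steps with m odd steps the iterate is roughly a^m / 2^k times the start, so the outcome is governed by how often a^m stays below 2^k. That happens for a majority of parity patterns when a = 3 and for a minority when a = 5.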

It is the long-term behavior of iterates where all the difficulties connected with the 3x + 1 
and 5x + 1 functions lie. 

11.2. Insights 

Comparison of the results of these stochastic models, combined with deterministic results, 
delivers certain insights into the 3x + 1 and 5x + 1 problems, and suggests topics for 
further work. 

First, the 2-adic results indicate that the differences in the dynamics of the 3x + 1 map 
and 5x + 1 map on the integers are invisible at the level of measure theory. Therefore these 
differences must depend in some way on number-theoretic features inside the integers $\mathbb{Z}$. 

Second, the behavior of the iteration of these functions on $\mathbb{Z}$, viewed inside the 2-adic 
framework, must be encoded in the specific properties of the conjugacy maps $\Phi_3$ and $\Phi_5$ 
identifying these maps with the 2-adic shift map. Here we note that there is an explicit formula 
for the 3x + 1 conjugacy map, obtained by Bernstein [8], and there is an analogous formula for 
the 5x + 1 conjugacy map as well. These conjugacy maps have an intricate structure, detailed 
in [9], which might be worthy of further investigation. 

Third, we observe that the ergodic behavior of the 2-adic extensions is exactly the behavior 
that served as a framework to formulate the random walk models presented in ^ ^ and 
^ These random walk models yield information by combining these model iterations with 
estimates of the size of iterates in the standard absolute value on the real line $\mathbb{R}$. That is, 
they use information from an archimedean norm, rather than the non-archimedean norm on 
the 2-adic integers. Perhaps one needs to consider models that incorporate both norms at once, 
e.g. functions on $\mathbb{R} \times \mathbb{Z}_2$. 

Fourth, a suitable maximal domain, larger than $\mathbb{Z}$, on which to understand the difference 
between the dynamics of the 3x + 1 map $T_3$ and the 5x + 1 map $T_5$ appears to be the domain 

$$
\mathbb{Q}_{(2)} := \mathbb{Q} \cap \mathbb{Z}_2,
$$

i.e. the set of rational numbers that are 2-adic integers. The set $\mathbb{Q}_{(2)}$ is exactly the set of 
rational numbers having an odd denominator, and both $T_3$ and $T_5$ leave the set $\mathbb{Q}_{(2)}$ invariant. 
This set includes all periodic orbits of both $T_3$ and $T_5$, and from the viewpoint of existence of 
periodic orbits, these two maps are the same on $\mathbb{Q}_{(2)}$. The difference in the dynamics of these 
maps on $\mathbb{Z}$ seems to have something to do with the distribution of these periodic orbits. Viewing 
$\mathbb{Q}_{(2)}$ as having the topology induced from the 2-adic topology, one may conjecture that $T_3$ and 
$T_5$ are not topologically conjugate mappings on this domain. 

Fifth, the 5x + 1 map exhibits various "exceptional" behaviors. Although almost all of its 
integer orbits (conjecturally) diverge, there nevertheless exists an infinite exceptional set of 
integers that have eventually periodic orbits. The density (fractional dimension) of such integers 
is predicted (conjecturally) to be a constant $\delta_5 \approx 0.649$, solving a large deviations functional 
equation. This seems a hard problem to resolve rigorously. Now, for the 3x + 1 map, a similar 
prediction is made by the models for the growth constant $\delta_3 = 1$. It too is the solution of a large 
deviations functional equation. We currently know that $1 \ge \delta_3 \ge 0.84$. This analogy suggests 
that rigorously proving that the growth constant $\delta_3 = 1$ may turn out to be a much harder 
problem than it seems at first glance. 

Sixth, we note that there are extensions of the maps for backwards iteration to larger 
domains, to the invertible 3-adic integers $\mathbb{Z}_3^{\times}$ for the 3x + 1 map, and to the invertible 5-adic 
integers $\mathbb{Z}_5^{\times}$ for the 5x + 1 map. In effect the branching random walk models may fruitfully be 
extended to allow root node labels that are invertible 3-adic integers (resp. 5-adic integers), 
and this provides enough information to grow the entire infinite tree. Various interesting 
properties of the extended 3x + 1 trees obtained this way have been established, cf. [4]. This is a topic 
worth further investigation. 